Need some help from the forums for an experiment in Machine Learning for grading comics.


Grade a comic using machine learning
The tooling for building machine learning models has become fairly easy to use, and I wanted to see if I could build a model that can grade a comic. I've got the tooling to create the model; what I need are images.

As a start, I wanted to focus on one particular comic book. The reason I only want to use one book is the evaluation criteria: the idea is to classify a book into a particular score, so the criteria are much narrower than if, for instance, we were trying to recognize whether a picture contains a comic book at all. That case would be relatively easy to model.

I was thinking ASM #300 would probably be something people could dig up easily in various grades. Granted, we probably won't see any 2.0s, but we can probably get a decent sample of images from 7.0 to 9.8.

To be clear, this is not a replacement for the CGC service, and I have no intent to turn any of this into a pay-to-play product. I honestly have no idea whether it will work, given the image sample requirements; it's difficult to create something like this for just one comic book, much less the millions that exist out there. Given the variety of possible defects, I will likely need many samples at each score level to get something that can accurately discern between a 9.4 and a 9.8.

What I need

  • Raw front and back cover scans of ASM #300, one of each per copy. I must have both the front and back cover.
    • Obviously this could be difficult to find, since people don't always scan a comic front and back before getting it graded.
  • The grade assigned to those front/back scans as a number (9.8, 9.6, 2.0, etc.); no text grades, please.

Image caveats

  • No cropping of edges, the comic front and back cover must be fully visible.
  • No partial images (i.e., corner shots, close-ups of defects, etc.).
  • The defects that cause the grade issues must be visible in the scan.
    • This won't work for interior page defects. Keeping it simple for now.
    • It won't work for books already in a holder.
  • Try to keep them reasonably sized.
  • JPG or PNG only.

The more scans I can obtain at each grade, the better the model will end up being.

If it ends up having decent accuracy, I'll make the model publicly available and put up a web page where you can submit your own images of ASM 300 for grading. If nothing else, it might be a halfway decent pre-screen.
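For anyone wondering what happens to the scans once I have them, here is roughly the intake step. This is a minimal sketch assuming Python with Pillow; the folder layout, the _front/_back filename convention, and the target size are placeholder choices of mine, not requirements for submitters.

```python
# Pair each copy's front/back cover scans into one training image,
# sorted into one folder per grade (e.g. dataset/9.8/). Filenames like
# "copy17_front.jpg" / "copy17_back.jpg" are an assumed convention.
from pathlib import Path
from PIL import Image

RAW = Path("submissions")   # incoming scans: submissions/<grade>/<copy>_front.jpg
OUT = Path("dataset")       # output: one subfolder per grade
SIZE = (600, 900)           # normalize every scan to the same dimensions

for front in RAW.glob("*/*_front.*"):
    back = next(front.parent.glob(front.stem.replace("_front", "_back") + ".*"), None)
    if back is None:
        continue            # both covers are required; skip incomplete copies
    f = Image.open(front).convert("RGB").resize(SIZE)
    b = Image.open(back).convert("RGB").resize(SIZE)
    combo = Image.new("RGB", (SIZE[0] * 2, SIZE[1]))  # front and back side by side
    combo.paste(f, (0, 0))
    combo.paste(b, (SIZE[0], 0))
    grade_dir = OUT / front.parent.name               # the grade is the folder name
    grade_dir.mkdir(parents=True, exist_ok=True)
    combo.save(grade_dir / (front.stem.replace("_front", "") + ".png"))
```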



 


11 minutes ago, Angel of Death said:

So, you want raw scans and grades from different individuals, and you want to use these subjective pieces of data to educate a machine?

Let me know if that clicks.

Ideally it would be the raw scans of graded books, which is why it's a tall ask. Grading is somewhat subjective no matter how it's done, but if we're looking to replicate the CGC grading process, then the actual grade should come from CGC.

If that's mission impossible, then we could just take books scored by users and train the model on those, but obviously the results would be more reflective of what you might get in "Hey buddy can you spare a grade."

That said, even with user data this could still have value. The result you get from the model is not just one score: it evaluates the image and returns every category it could fall into, with a percentage value for the certainty of each, the highest being the primary result. Ideally, if the model is working well, you get a certainty of 90+% on that primary result.
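To make that concrete, here is what the output side looks like. A toy sketch in Python/NumPy: the grade list and the raw numbers are invented for illustration, not output from a trained model.

```python
# The classifier returns one certainty per grade category, not a single
# score. The logits below stand in for a real model's final-layer
# outputs for one submitted scan.
import numpy as np

GRADES = ["9.8", "9.6", "9.4", "9.2", "9.0"]      # the trained categories
logits = np.array([4.6, 2.0, 0.5, -0.8, -1.5])    # hypothetical raw outputs

probs = np.exp(logits) / np.exp(logits).sum()     # softmax -> certainties
for grade, p in sorted(zip(GRADES, probs), key=lambda t: -t[1]):
    print(f"{grade}: {p:.1%}")                    # top line is the primary result
```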


1 minute ago, RhialtoTheMarvellous said:

Ideally it would be the raw scans of graded books, which is why it's a tall ask. Grading is somewhat subjective no matter how it's done, but if we're looking to replicate the CGC grading process, then the actual grade should come from CGC.

If that's mission impossible, then we could just take books scored by users and train the model on those, but obviously the results would be more reflective of what you might get in "Hey buddy can you spare a grade."

That said, even with user data this could still have value. The result you get from the model is not just one score: it evaluates the image and returns every category it could fall into, with a percentage value for the certainty of each, the highest being the primary result. Ideally, if the model is working well, you get a certainty of 90+% on that primary result.

The bold is not possible. If that's the endgame, then it is in fact "mission impossible".


1 minute ago, Angel of Death said:

The bold is not possible. If that's the endgame, then it is in fact "mission impossible".

It's not exactly that. I should have said that if we want it closer to the CGC scoring, then the scoring should come from CGC.

The criteria for scoring are based on visual data. Comic defects are visual. If you aggregate enough information into the model for each classification, then the model will be able to discern the differences.


1 minute ago, RhialtoTheMarvellous said:

It's not exactly that. I should have said that if we want it closer to the CGC scoring, then the scoring should come from CGC.

The criteria for scoring are based on visual data. Comic defects are visual. If you aggregate enough information into the model for each classification, then the model will be able to discern the differences.

I'm pretty sure that we already had this discussion a year ago. A machine will not be able to make objective distinctions from one comic to another using only cover photos.

How is the machine going to identify a Gem Mint?


54 minutes ago, THE_BEYONDER said:

Unfortunately, NCB creases/bends don’t show up in scans.

I suppose your machine could give post-pressing results.


I probably emphasized the visual too much in my post. The machine can pick up on things in the image that aren't necessarily obvious to the human eye, since it breaks the image down using algorithms that I can only pretend to understand. But I would say the more detailed the image, the better.
 

54 minutes ago, Angel of Death said:

I'm pretty sure that we already had this discussion a year ago. A machine will not be able to make objective distinctions from one comic to another using only cover photos.

How is the machine going to identify a Gem Mint?

Let's consider the problem: how does a machine evaluate whether a picture contains a cat or a dog? This is a solved problem, btw.

You collect a lot of images of cats and put them into the cat category, put a lot of images of dogs into the dog category, and let a convolutional neural network break down the images in each category looking for patterns.

https://developers.google.com/machine-learning/practica/image-classification/convolutional-neural-networks

The stage I'm at is gathering the dataset to see if I can feed the ML algorithm something it can use.
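For anyone who doesn't want to click through: with current tooling, the kind of network that article describes is only a few lines. A minimal sketch assuming TensorFlow/Keras; the layer counts and image size are arbitrary placeholder choices, not tuned values.

```python
# A small convolutional network for two-category image classification.
# Stacked conv/pool layers learn visual patterns at increasing levels
# of abstraction; the dense head maps them onto the categories.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(180, 180, 3)),        # RGB images at a fixed size
    layers.Rescaling(1.0 / 255),              # pixel values -> [0, 1]
    layers.Conv2D(16, 3, activation="relu"),  # low-level edges/textures
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),  # mid-level patterns
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),  # higher-level structure
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(2, activation="softmax"),    # two categories: cat, dog
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```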

For the gem mint problem, a machine evaluates the image based on the criteria it is given; the two criteria in this case are the score and the image. The machine evaluates the image at multiple different levels, meaning there are a bunch of criteria it uses that we don't necessarily understand. This is the black box of the model. If you put enough data into the 10.0 category alongside the data from 9.9 and 9.8, it will be able to discern the differences, because it breaks the image down along lines that aren't purely visual.

That said, since there are few examples of 10.0 or 9.9, the limitation is that you cannot rely on the machine to give you an accurate result in that range, just as you could not rely on it to call any modern comic a 2.0, since there aren't many examples of that either. These are admitted limitations, and something I already acknowledged in the original post.

I still would like to try to gather some data in this regard, but you are making me think about this problem overall and what other things we could do. Rather than going broad with the scoring criteria, which confines us to a particular comic and the differences between its copies in each category, we could try to identify individual visual defects. For instance, spine ticks: comics with and without visible spine ticks could be told apart if we get enough examples.
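If the defect route looks promising, one way to get by with a smaller sample set is transfer learning, i.e. starting from a network already trained on generic photos and only teaching it the last step. That's a suggestion on my part, not something I've built; a sketch assuming TensorFlow/Keras and a hypothetical spine_ticks/ folder with ticks/ and clean/ subfolders:

```python
# Binary "spine ticks present / absent" classifier on a pretrained
# backbone (transfer learning), a common approach when labeled images
# are scarce. Folder layout and epoch count are illustrative guesses.
import tensorflow as tf
from tensorflow.keras import layers, models

train_ds = tf.keras.utils.image_dataset_from_directory(
    "spine_ticks", label_mode="binary",
    image_size=(224, 224), batch_size=32)

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                          # keep the pretrained features

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1),   # MobileNetV2 expects [-1, 1]
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),      # ticks vs. clean
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```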

 


This is fascinating. I'm very curious to see the results. Unfortunately, I'm not a Spider-Man collector, so I don't have any images to send you.


1 hour ago, RhialtoTheMarvellous said:

I probably emphasized the visual too much in my post. The machine can pick up on things in the image that aren't necessarily obvious to the human eye, since it breaks the image down using algorithms that I can only pretend to understand. But I would say the more detailed the image, the better.
 

Let's consider the problem: how does a machine evaluate whether a picture contains a cat or a dog? This is a solved problem, btw.

You collect a lot of images of cats and put them into the cat category, put a lot of images of dogs into the dog category, and let a convolutional neural network break down the images in each category looking for patterns.

https://developers.google.com/machine-learning/practica/image-classification/convolutional-neural-networks

The stage I'm at is gathering the dataset to see if I can feed the ML algorithm something it can use.

For the gem mint problem, a machine evaluates the image based on the criteria it is given; the two criteria in this case are the score and the image. The machine evaluates the image at multiple different levels, meaning there are a bunch of criteria it uses that we don't necessarily understand. This is the black box of the model. If you put enough data into the 10.0 category alongside the data from 9.9 and 9.8, it will be able to discern the differences, because it breaks the image down along lines that aren't purely visual.

That said, since there are few examples of 10.0 or 9.9, the limitation is that you cannot rely on the machine to give you an accurate result in that range, just as you could not rely on it to call any modern comic a 2.0, since there aren't many examples of that either. These are admitted limitations, and something I already acknowledged in the original post.

I still would like to try to gather some data in this regard, but you are making me think about this problem overall and what other things we could do. Rather than going broad with the scoring criteria, which confines us to a particular comic and the differences between its copies in each category, we could try to identify individual visual defects. For instance, spine ticks: comics with and without visible spine ticks could be told apart if we get enough examples.

Have you heard of false equivalence?


1 minute ago, speedcake said:

How will the machine grade the book's interior, staple quality, page quality, and structural integrity (water damage, etc.), or check for restoration?

I don't have any scans of ASM 300 to provide in your quest, sorry. :(

As I noted in the OP, the interior stuff isn't going to be accounted for in this particular scenario, though in many cases staple rust is visible on the cover areas. Given the parameters of this experiment, you could fool the thing any number of ways even if the model is well trained. If you have a mint-looking set of covers and half the interior pages are cut out, it likely won't be able to tell, unless there is some definitive difference in the images given to it.

I'm really just trying to keep it simple at first to see if I can compose some sort of model, but it was kind of a long shot asking for submissions on here anyway.

I've already downloaded all of the images from the CGC Registry and built a model around them, but the images are too spread out and varied in quality for it to achieve any sort of macro accuracy (the ability to classify correctly across categories), and they are mostly all in holders, which limits visibility.
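(By macro accuracy I mean performance across all the categories, as opposed to just nailing the one most common grade.) A quick sketch of how that gets checked, assuming scikit-learn; the labels and predictions below are made-up stand-ins, not results from that registry model.

```python
# Per-category results via a confusion matrix and report; the
# "macro avg" row is the across-category number of interest.
from sklearn.metrics import classification_report, confusion_matrix

labels = ["9.8", "9.6", "9.4", "9.2", "9.0"]
y_true = ["9.8", "9.8", "9.6", "9.6", "9.4", "9.4", "9.2", "9.0"]  # CGC grades
y_pred = ["9.8", "9.6", "9.6", "9.6", "9.4", "9.6", "9.2", "9.2"]  # model output

print(confusion_matrix(y_true, y_pred, labels=labels))
print(classification_report(y_true, y_pred, labels=labels, zero_division=0))
```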

That said, it's much more likely that we could build a model around one of these attributes you've pointed out.

For instance, page quality: scan a sample interior page section from any book and have it discern the difference between white and off-white page color.
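Page color might not even need a neural net; plain pixel statistics could get close. A crude sketch assuming Pillow and NumPy, where the thresholds and the filename are illustrative guesses, not calibrated to any grading standard:

```python
# Crude whiteness check on an interior-page scan: average brightness
# of the grayscale image, bucketed into page-quality labels.
import numpy as np
from PIL import Image

page = np.asarray(Image.open("interior_page.png").convert("L"), dtype=float)
brightness = page.mean() / 255           # 1.0 would be pure white paper

if brightness > 0.90:
    print(f"white ({brightness:.2f})")
elif brightness > 0.80:
    print(f"off-white ({brightness:.2f})")
else:
    print(f"cream/tan ({brightness:.2f})")
```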

Or staple quality; I'm not sure if there is a metric around that, or if staples are just good or bad. If the metric were rusty or not rusty, I'm sure we could easily have it discern the difference, but at the same time that is probably an easy one for a human to eyeball.


4 minutes ago, Angel of Death said:

Have you heard of false equivalence?

In what way do you find this comparison flawed?

What attributes does the machine build up in the model from a few thousand images of dogs and cats that then allow it to take a completely new image and assign a value of dog or cat to it?


2 minutes ago, RhialtoTheMarvellous said:

In what way do you find this comparison flawed?

What attributes does the machine build up in the model from a few thousand images of dogs and cats that then allow it to take a completely new image and assign a value of dog or cat to it?

Are you seriously suggesting that being able to identify what is and is not a cat (black and white) is the same as being able to verify an item's condition (more than 2 colors/shades)?


Just now, Angel of Death said:

Are you seriously suggesting that being able to identify what is and is not a cat (black and white) is the same as being able to verify an item's condition (more than 2 colors/shades)?

To a computer, the difference between a dog and a cat is not black and white; it only seems that way from the human perspective. In reality, you're unconsciously examining a hundred different things about the image to contextualize it as a dog or a cat.

If you give a computer a picture of a dog or a cat without any identifier, say image007.jpg, it has no idea what the heck it is. It can't classify what the image shows in any meaningful way like a human can; it knows only the things the operating system is programmed to understand. It could tell you the name of the file, whether it's a PNG or a JPEG, or how big it is in KB, but it won't be able to tell you that it's a picture of a dog or a cat.

The idea behind machine learning is to give the machine the ability to do that, by training a model. You train a model using an existing dataset in which a large set of data is broken into the categories you establish (dog or cat, or 9.8, 9.6, 9.4, etc.).

For instance, you take 2,000 images of dogs and put them all into a folder/category named dogs.
Then you take 2,000 images of cats and put them all into a folder/category named cats.

You let the computer examine the image sets, and the learning algorithms establish the attributes the model will use to identify images. This is something of a black box, in that we don't necessarily know what the computer is building up in the model to discern the differences.

The training algorithm takes a subset of the given data, say 3/4 of it, and builds up the model; it then validates the model against the remaining 1/4 (i.e., it runs those images through the model it has built, without any category information, to see if it gets the answers right), giving you an idea of whether it works or not. It's an amazing thing. The only rub is obtaining the data. It's pretty easy to find pictures of dogs and cats, but not so much pics of comics with specific grades.
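In the tooling I'm using, that split is one option away. A sketch assuming TensorFlow/Keras, with a hypothetical dataset/ folder holding one subfolder per category:

```python
# 3/4 train / 1/4 validate split over a folder-per-category dataset.
# The same seed keeps the two subsets from overlapping.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", validation_split=0.25, subset="training",
    seed=42, image_size=(180, 180), batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", validation_split=0.25, subset="validation",
    seed=42, image_size=(180, 180), batch_size=32)

# Training only ever sees train_ds; val_ds answers "does it get unseen
# examples right?" via model.fit(train_ds, validation_data=val_ds).
```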


10 minutes ago, RhialtoTheMarvellous said:

To a computer, the difference between a dog and a cat is not black and white; it only seems that way from the human perspective. In reality, you're unconsciously examining a hundred different things about the image to contextualize it as a dog or a cat.

If you give a computer a picture of a dog or a cat without any identifier, say image007.jpg, it has no idea what the heck it is. It can't classify what the image shows in any meaningful way like a human can; it knows only the things the operating system is programmed to understand. It could tell you the name of the file, whether it's a PNG or a JPEG, or how big it is in KB, but it won't be able to tell you that it's a picture of a dog or a cat.

The idea behind machine learning is to give the machine the ability to do that, by training a model. You train a model using an existing dataset in which a large set of data is broken into the categories you establish (dog or cat, or 9.8, 9.6, 9.4, etc.).

For instance, you take 2,000 images of dogs and put them all into a folder/category named dogs.
Then you take 2,000 images of cats and put them all into a folder/category named cats.

You let the computer examine the image sets, and the learning algorithms establish the attributes the model will use to identify images. This is something of a black box, in that we don't necessarily know what the computer is building up in the model to discern the differences.

The training algorithm takes a subset of the given data, say 3/4 of it, and builds up the model; it then validates the model against the remaining 1/4 (i.e., it runs those images through the model it has built, without any category information, to see if it gets the answers right), giving you an idea of whether it works or not. It's an amazing thing. The only rub is obtaining the data. It's pretty easy to find pictures of dogs and cats, but not so much pics of comics with specific grades.

The computer is only assessing whether or not an image is a cat or dog. It's black and white.


2 minutes ago, Angel of Death said:

The computer is only assessing whether or not an image is a cat or dog. It's black and white.

Are you confused because there are multiple categories in the comic book example (10.0, 9.9, 9.8, 9.6, etc.) versus, say, the two in the dog/cat example?

That isn't really an issue. You can establish any number of different categories for the computer to evaluate, as long as you have the sample data for those categories. You could have the computer evaluate dog/cat/fox/beaver if you wanted to. This is, in fact, what Google and other big-data companies do for their image search.
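Mechanically, going from two categories to four, or to a dozen grade levels, only changes the size of the output layer. A minimal sketch, reusing the Keras setup from earlier in the thread:

```python
# One output unit per category; softmax keeps the certainties summing
# to 100% no matter how many categories there are.
from tensorflow.keras import layers

categories = ["dog", "cat", "fox", "beaver"]    # could just as well be grades
output_layer = layers.Dense(len(categories), activation="softmax")
print(len(categories), "categories ->", len(categories), "output units")
```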

 

