Need some help from the forums for an experiment in Machine Learning for grading comics.


VintageComics · March 3, 2021

You can't grade a book from a 2 dimensional scan because grading a comic takes more than even 3 dimensions.

Not only are you looking at obvious physical defects, you need to take into account defects at angles, interior defects (pages, staples), possible smell, weight of the book (missing pages), how it feels in hand (how a book feels is important in some cases), interior of the covers, smudges / stains / tanning / dust shadows and how deep all of those defects go in the paper.

So much you can't see just from a 2 D scan.

The best you may be able to do is grade some otherwise perfect 9.8 books with no interior defects down somewhere lower into the lower NM range by adjusting for a few small defects.

Once you drop out of the NM range, the number of defects in play goes up exponentially and can include many that go beyond how the front and rear covers look.

comicginger1789 · March 3, 2021

Rather than dump on the experiment couldn’t we just acknowledge how interesting it is? And if someone wants to tackle that and share the results, how is that a bad thing?

I am intrigued and think the machine deserves a shot. If it can correctly predict grades within a certain degree of accuracy, why would that not be helpful as a base pre screen tool? Obviously the interior would need to be complete and devoid of certain flaws but it’s a starting point.

I will send scans of my ASM 300 as soon as it’s back from CGC!

allthingskryptonite · March 3, 2021

15 hours ago, RhialtoTheMarvellous said:

Grade a comic using machine learning
The tooling for building out Machine Learning models has become fairly easy to use and I wanted to see if I could build a model that can grade a comic. I've got the tooling to create the model, but what I need are images.

As a start I wanted to focus on one particular comic book. The reason I only want to use one book is because of the evaluation criteria. The idea is that we want to be able to classify a book into a particular score. So the criteria is much more narrow than for instance if we were going to recognize whether a picture has a comic book in it. That case would be relatively easy to model.

I was thinking ASM #300 would probably be something people could dig up easily in various grades. Granted we won't probably see any 2.0s, but we can probably get a decent sample of images from 7.0 to 9.8.

To be clear, this is not a replacement for the CGC service and I have no intent to make any of this into a pay to play product. I honestly have no idea whether it will work or not due to the image sample requirements and it's difficult to create something like this just for one comic book much less the millions of them that exist out there. Given the variety of defects possible I will likely need many samples at each score level to get something that can discern accurately between a 9.4 and a 9.8.

What I need

Raw front and back cover scans of ASM #300. One of each per copy. I must have both the front and back cover.
Obviously this could be difficult to find since people don't always scan a comic front and back before getting it graded.

The grade assigned to those front/back scans as a number (9.8, 9.6, 2.0, etc), no text grades please.

Image caveats

No cropping of edges, the comic front and back cover must be fully visible.

No partial images (ie corner shots, close ups of defects etc).

The defects that cause the grade issues must be visible in the scan.
This won't work for interior page defects. Keeping it simple for now.

It won't work for books already in a holder.

Try to keep them reasonably sized.

JPG or PNG only.

The more scans at each grade that I can obtain the better the model will end up being.

If it ends up that it has decent accuracy I'll make the model publicly available and put up a web page where you can submit your own images of ASM 300 for grading. If anything it might be a halfway decent pre-screen.
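The submission rules above could be checked mechanically before any training starts. What follows is only a sketch under assumptions, not the OP's actual tooling: the `Submission` record, the file names, and the `validate` helper are hypothetical, and the accepted grade range (above 0, up to 10) is an assumption about the numeric scale.

```python
from dataclasses import dataclass
from pathlib import PurePath

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # "JPG or PNG only"


@dataclass
class Submission:
    """One copy of the book: a front scan, a back scan, and a numeric grade."""
    front: str
    back: str
    grade: str  # kept as text so that entries like "VF" can be rejected


def validate(sub: Submission) -> list[str]:
    """Return a list of problems; an empty list means the submission is usable."""
    problems = []
    for side, name in (("front", sub.front), ("back", sub.back)):
        if not name:
            problems.append(f"missing {side} cover scan")
        elif PurePath(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{side} scan must be JPG or PNG: {name}")
    try:
        value = float(sub.grade)
    except ValueError:
        problems.append(f"grade must be a number, not text: {sub.grade!r}")
    else:
        # Assumed bounds for the numeric scale; adjust if the scale differs.
        if not 0.0 < value <= 10.0:
            problems.append(f"grade out of range: {value}")
    return problems
```

A check like this only covers the bookkeeping (both covers present, file type, numeric grade). Cropping, partial images, and slabbed books would still need a human look or a separate image check.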

This sounds like a very interesting project. I can't go through the entire thread right now, but I sure will sometime tomorrow.

I understand a bit of Neural Networks and comic book grading, so jotting down my initial thought.

I would break the problem in two stages.

First I would start with something simple.

There are nuances that affect comic book grades. Conditions like mold, foxing, stains, staple rust, staple pop, centerfold detached, water damage, subscription crease, residue -- these affect the grade.

I would train a simple feed-forward neural network on a binary feature vector (one-hot style, though strictly multi-hot since several conditions can be present at once), where each feature is a condition being present or absent (mold, foxing, stains, staple rust, staple pop, centerfold detached, water damage, subscription crease, residue), which we would extract from grader notes (text data).

Although this is a very primitive design, this simple setup would be able to make some decent predictions.

Imagine how different features in your one hot vector help train the simple classifier:

Is cover missing? Yes or No. If Yes, then grade is always 0.3

Pages missing? Centerfold missing? Yes or No. If Yes, then most likely a 0.5

Subscription crease? Book length crease? Yes or No. If Yes, then grade most likely in range ~4.0
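To make the feature idea concrete, here is a minimal sketch of the encoding plus the three rule-of-thumb examples. It is not a trained network and not an official grading standard: the `encode` and `grade_cap` helpers are hypothetical names, the condition list is copied from the post, and the ceilings (0.3, 0.5, about 4.0) are just the numbers quoted above.

```python
# Condition names follow the post; the order is arbitrary but must stay fixed.
CONDITIONS = [
    "mold", "foxing", "stains", "staple rust", "staple pop",
    "centerfold detached", "water damage", "subscription crease", "residue",
]

# Grade ceilings taken from the three examples in the post. They are rules of
# thumb from the thread, not an official grading standard.
GRADE_CAPS = {
    "cover missing": 0.3,
    "pages missing": 0.5,
    "centerfold missing": 0.5,
    "subscription crease": 4.0,
    "book length crease": 4.0,
}


def encode(notes: set[str]) -> list[int]:
    """Turn a set of noted conditions into a fixed-length 0/1 vector."""
    return [1 if name in notes else 0 for name in CONDITIONS]


def grade_cap(notes: set[str], best: float = 10.0) -> float:
    """Highest grade still plausible once the hard rules are applied."""
    return min([best] + [cap for name, cap in GRADE_CAPS.items() if name in notes])
```

For example, `encode({"foxing", "residue"})` gives `[0, 1, 0, 0, 0, 0, 0, 0, 1]`, and a book noted with both a subscription crease and a missing centerfold is capped at 0.5. A classifier trained on such vectors would be expected to learn ceilings like these from the data rather than have them hard-coded.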

Once I get this primitive design to work with a reasonable accuracy, then I would attempt the bigger problem.

Then I would try to design a CNN that takes image data as input to predict the comic book grade, with the one-hot vector of features as the intermediate latent variables.

The accuracy of the network would depend on the amount of data you have. But I think it will be able to predict ranges, like a book being in vicinity of GD, VG, F or VF or so on.
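Scoring predictions by range rather than by exact grade could look something like the sketch below. The `band` and `accuracy` helpers are hypothetical, and reducing the scale to five anchor values (GD 2.0, VG 4.0, F 6.0, VF 8.0, NM 9.4) is a deliberate simplification, since the real scale has many in-between steps.

```python
# Anchor values for the named grades on the 10-point scale. Only five broad
# bands are used here; the real scale has many in-between steps.
BANDS = {"GD": 2.0, "VG": 4.0, "F": 6.0, "VF": 8.0, "NM": 9.4}


def band(grade: float) -> str:
    """Name of the band whose anchor value is closest to the numeric grade."""
    return min(BANDS, key=lambda name: abs(BANDS[name] - grade))


def accuracy(predicted: list[float], actual: list[float]) -> tuple[float, float]:
    """Return (exact-match accuracy, same-band accuracy) for paired grades."""
    pairs = list(zip(predicted, actual))
    exact = sum(p == a for p, a in pairs) / len(pairs)
    same_band = sum(band(p) == band(a) for p, a in pairs) / len(pairs)
    return exact, same_band
```

With made-up predictions `[9.4, 7.5, 4.0, 2.5]` against actual grades `[9.8, 8.0, 4.0, 6.0]`, only one of four is an exact match, but three of four land in the right band. That gap is the reason a range metric may be the fairer first target.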

Let us know if you make any progress.

Jginsberg79 · March 3, 2021

A few initial thoughts:

1. For expanding the data sample, why not simply use modern or Bronze Age books? If you limit the initial model training to specific atomic characteristics, they should remain consistent across the sample (spine ticks, % of cover missing, color breaks, etc.).
2. Why try to create a grade initially? It seems like a report of the characteristics is the first stage or goal for generation. The model, or a clustering model, could be used later to try to create a grade, but initially, before even hammering out core metrics, it's adding confusion in the thread and another level of complication (i.e. failure).
3. This can be done. Insurance companies use ML modeling on car claims from adjusters to make initial projections for returns. Travel companies use it to make recommendations on hundreds of thousands of photos (Airbnb, Booking, Expedia, etc.). There are a ton more examples, but for identifying and measuring set parameters, it can be done.
4. For standardization of photo quality, were this ever to be more than a fun exercise, it would likely need to be a mail-in service or have some parameters around resolution, lighting, etc. However, I'd consider all of that a later problem for new users. Yes, there would have to be some data cleaning for photos contributing to model training, but OP already discussed some of that.
5. TensorFlow and the Google Cloud suite are a great place to start, though I'm biased.

Happy to discuss more here or elsewhere too! I've been meaning to try this for a while too, so glad to see there's dozens of us :-)

theCapraAegagrus · March 3, 2021

4 hours ago, comicginger1789 said:

Rather than dump on the experiment couldn’t we just acknowledge how interesting it is? And if someone wants to tackle that and share the results, how is that a bad thing?

I am intrigued and think the machine deserves a shot. If it can correctly predict grades within a certain degree of accuracy, why would that not be helpful as a base pre screen tool? Obviously the interior would need to be complete and devoid of certain flaws but it’s a starting point.

I will send scans of my ASM 300 as soon as it’s back from CGC!

No.

RhialtoTheMarvellous · March 3, 2021

7 hours ago, onlyweaknesskryptonite said:

Since CGC has stated that generally they do not scan before and only after if you paid/requested one, I believe a better chance at you getting the images with grades you seek, would be one of the professional pressers who document / scan all their work.

That's a good point. I'm actually on a FB group for pressers.

RhialtoTheMarvellous · March 3, 2021

2 hours ago, Jginsberg79 said:

A few initial thoughts:

1. For expanding the data sample, why not simply use modern or Bronze Age books? If you limit the initial model training to specific atomic characteristics, they should remain consistent across the sample (spine ticks, % of cover missing, color breaks, etc.).

2. Why try to create a grade initially? It seems like a report of the characteristics is the first stage or goal for generation. The model, or a clustering model, could be used later to try to create a grade, but initially, before even hammering out core metrics, it's adding confusion in the thread and another level of complication (i.e. failure).

3. This can be done. Insurance companies use ML modeling on car claims from adjusters to make initial projections for returns. Travel companies use it to make recommendations on hundreds of thousands of photos (Airbnb, Booking, Expedia, etc.). There are a ton more examples, but for identifying and measuring set parameters, it can be done.

4. For standardization of photo quality, were this ever to be more than a fun exercise, it would likely need to be a mail-in service or have some parameters around resolution, lighting, etc. However, I'd consider all of that a later problem for new users. Yes, there would have to be some data cleaning for photos contributing to model training, but OP already discussed some of that.

5. TensorFlow and the Google Cloud suite are a great place to start, though I'm biased.

Happy to discuss more here or elsewhere too! I've been meaning to try this for a while too, so glad to see there's dozens of us :-)

1. If there is one thing I've gotten out of this thread (and I've gotten more than that) it's the idea that I might run an experiment of this sort initially on binary characteristics, like spine ticks present or not, or creases in the cover or not, to see how well I could train a model in that regard. This would also be an easier experiment from a data collection perspective, as I'm sure everyone could come up with books both with and without spine ticks.

2. Well, that's part of the deep learning problem solving aspect and one reason why machine learning can be so valuable, because it can derive outcomes using combinations of factors that aren't always evident to humans. You give the machine a bunch of data on one end and a known set of results on the other and then let it interpret the factors that differentiate the source from the results on its own. The big thing with this is training a model to predict medical conditions. You give the machine publicly available health data of thousands of people and which ones get a certain condition and which ones don't and then it can predict with a fair degree of accuracy whether an individual it is given is at risk for that condition.

4. There is definitely some standard necessary. I'm not yet sure what it is, but the CGC images are pretty weak in that regard if the CGC registry is any indication. A lot of the things I pulled off there are not even scans. They are photos of a slabbed book with bad lighting or bad angles or photos of just the score or the top of the holder.
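The binary experiment from point 1, run in the "data on one end, known results on the other" way from point 2, can be sketched at toy scale. Everything here is an assumption for illustration: the single hand-measured feature, the made-up numbers, and the use of plain logistic regression as a stand-in for whatever image model would actually be trained.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with per-sample gradient steps."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x) -> float:
    """Probability that the defect (here: spine ticks) is present."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up toy data: one hand-measured feature per book (count of white
# breaks along the spine, scaled to 0..1) and a 0/1 label from a human.
samples = [[0.0], [0.1], [0.2], [0.7], [0.8], [1.0]]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = train(samples, labels)
```

After training, `predict(weights, bias, [0.05])` falls below 0.5 and `predict(weights, bias, [0.9])` above it. The model is never told where the cutoff is; it works that out from the labeled examples, which is the same principle a deep network would apply to raw pixels with far more parameters and far more data.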

Jginsberg79 · March 3, 2021

31 minutes ago, RhialtoTheMarvellous said:

1. If there is one thing I've gotten out of this thread (and I've gotten more than that) it's the idea that I might run an experiment of this sort initially on binary characteristics, like spine ticks present or not, or creases in the cover or not, to see how well I could train a model in that regard. This would also be an easier experiment from a data collection perspective, as I'm sure everyone could come up with books both with and without spine ticks.

2. Well, that's part of the deep learning problem solving aspect and one reason why machine learning can be so valuable, because it can derive outcomes using combinations of factors that aren't always evident to humans. You give the machine a bunch of data on one end and a known set of results on the other and then let it interpret the factors that differentiate the source from the results on its own. The big thing with this is training a model to predict medical conditions. You give the machine publicly available health data of thousands of people and which ones get a certain condition and which ones don't and then it can predict with a fair degree of accuracy whether an individual it is given is at risk for that condition.

4. There is definitely some standard necessary. I'm not yet sure what it is, but the CGC images are pretty weak in that regard if the CGC registry is any indication. A lot of the things I pulled off there are not even scans. They are photos of a slabbed book with bad lighting or bad angles or photos of just the score or the top of the holder.

In regards to #2, you're still trying to abstract a subjective measure from objective data without an empirical standard and without the ability to measure the full spectrum (such as page quality beyond just "color"). If you limit your variables and output, initially, to variables and not a subjective measure (yes, CGC is in many ways subjective), that's the first phase of testing.

VintageComics · March 3, 2021

7 hours ago, comicginger1789 said:

Rather than dump on the experiment couldn’t we just acknowledge how interesting it is?

That it's interesting goes without saying. I didn't read the whole thread but I personally wasn't dumping on it. I was sharing my experience with comic grading and why I don't think it will work.

But hey, all power to him!

comicginger1789 · March 3, 2021

33 minutes ago, VintageComics said:

That it's interesting goes without saying. I didn't read the whole thread but I personally wasn't dumping on it. I was sharing my experience with comic grading and why I don't think it will work.

But hey, all power to him!

I don’t know that your comments were negative but I get the skepticism! Either way whether it works great, or is 90% effective or even 50% effective it would be neat to see!

Spyder! · March 3, 2021

As someone who works with machine learning models at work all the time, I think it’s a super cool idea. Yes, there are challenges. Obviously grades are not based on the cover alone. But wouldn’t it be interesting to see how close you can get to predicting the grade given by CGC even when your data is limited to just the cover scan? I sure think so.
In practice, I can see a use for it as a way to identify bad grading of raw books, probably not for actual grade assignment. In other words, a seller lists a book for sale as VF, but based on visible defects on the cover alone, the model may be able to provide a probability of such a grade being given, and below a certain threshold value, you might conclude that the seller has overgraded (shocking!).
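A rough sketch of that threshold check, under assumptions: the model is taken to output a probability for each grade, "probability of such a grade being given" is read as the probability of the claimed grade or better, and the 0.2 threshold and the sample numbers are made up. The function names are hypothetical.

```python
def prob_at_least(grade_probs: dict[float, float], claimed: float) -> float:
    """Probability mass the model puts on grades at or above the claimed one."""
    return sum(p for grade, p in grade_probs.items() if grade >= claimed)


def looks_overgraded(grade_probs, claimed, threshold=0.2) -> bool:
    """Flag a listing when the claimed grade is improbable given the cover."""
    return prob_at_least(grade_probs, claimed) < threshold


# Hypothetical model output for one cover scan: grade -> probability.
cover_probs = {4.0: 0.10, 6.0: 0.55, 7.0: 0.25, 8.0: 0.08, 9.0: 0.02}
```

With these numbers a seller's claim of VF (8.0) gets about a 10% chance and would be flagged, while a claim of 6.0 would not. Since the model sees only the cover, a flag would mean "look closer", not "definitely overgraded".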

Anyway, sounds fun to me. I’d love it if folks would help you get some data together. Good luck!

Brock · March 3, 2021

This is a pretty fascinating discussion. It seems to me that if companies like CGC (and its various competitors) aren't already looking at this kind of approach, they're running a real risk of being displaced by startups that will. This leads me to several thoughts:

2-D scans and photographs are probably an inadequate resource for the effective development of this tool, aside from an interesting "hobby" project. Reference to NCB defects, for example, suggests these might not be visible in a 2-D image, though they should be visible in a 3-D image. Similarly, tools to assess the interior of a book could be utilized for a more comprehensive assessment than a simple 2-D image of the covers.
This kind of approach will ultimately be attractive to CGC or others as Moore's Law and similar effects dramatically diminish the cost of these 3-D imaging systems over time, until they reach a point where they are far more cost-effective than human labour.
While these systems will inevitably hold a financial advantage for users, they could ultimately have a performance advantage as well. On these boards we constantly lament the quality control issues at CGC, and we often talk about how grading is subjective, and how (even at CGC) it varies over time. Using machine learning to develop consistent grading and quality standards is an attractive possibility.
There are always funds to assist companies in doing research into innovative process improvement activities like this (and this is the area I now work in). As a U.S.-based company, CGC could easily tap into funding from the National Science Foundation's Small Business Technology Transfer Program to support this, and if the OP is a researcher at a college or university in the United States, it's possible that he could be paid to do this research with them.

TLDR version?

While using 2-D scans may not be the best way to approach this opportunity, the core idea is viable, and a smart company in this space would already be exploring it.

RhialtoTheMarvellous · March 3, 2021

1 hour ago, Brock said:

This is a pretty fascinating discussion. It seems to me that if companies like CGC (and its various competitors) aren't already looking at this kind of approach, they're running a real risk of being displaced by startups that will. This leads me to several thoughts:

2-D scans and photographs are probably an inadequate resource for the effective development of this tool, aside from an interesting "hobby" project. Reference to NCB defects, for example, suggests these might not be visible in a 2-D image, though they should be visible in a 3-D image. Similarly, tools to assess the interior of a book could be utilized for a more comprehensive assessment than a simple 2-D image of the covers.

This kind of approach will ultimately be attractive to CGC or others as Moore's Law and similar effects dramatically diminish the cost of these 3-D imaging systems over time, until they reach a point where they are far more cost-effective than human labour.

While these systems will inevitably hold a financial advantage for users, they could ultimately have a performance advantage as well. On these boards we constantly lament the quality control issues at CGC, and we often talk about how grading is subjective, and how (even at CGC) it varies over time. Using machine learning to develop consistent grading and quality standards is an attractive possibility.

There are always funds to assist companies in doing research into innovative process improvement activities like this (and this is the area I now work in). As a U.S.-based company, CGC could easily tap into funding from the National Science Foundation's Small Business Technology Transfer Program to support this, and if the OP is a researcher at a college or university in the United States, it's possible that he could be paid to do this research with them.

TLDR version?

While using 2-D scans may not be the best way to approach this opportunity, the core idea is viable, and a smart company in this space would already be exploring it.

It's interesting that you say that. At one point I decided to test out my new scanner and took a 1200dpi scan of one of my books. The image ended up being huge of course. I opened it up and started zooming in on various areas of the book looking closely for defects. The image was so detailed that I found myself zooming in a lot and what I found is that zooming in on a 1200dpi image is like using a microscope on it. There are so many scratches and marks that are completely invisible to the naked eye that you can detect at that resolution.

This is partly why I wanted to try just using 2D scans at first, because it occurred to me that even at a lower scanning resolution there are probably patterns a computer can detect that a human can't detect.
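For a sense of why the 1200dpi image ended up huge, the arithmetic can be sketched as below. The cover size of 6.625 x 10.25 inches is an assumption for a modern comic, and the size is for uncompressed 24-bit RGB, so a saved JPG or PNG would be smaller.

```python
def scan_pixels(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a flatbed scan at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def raw_megabytes(width_px: int, height_px: int, bytes_per_pixel: int = 3) -> float:
    """Uncompressed size in MB (10**6 bytes) for 24-bit RGB by default."""
    return width_px * height_px * bytes_per_pixel / 1e6


# Assumed cover size of a modern comic: 6.625 x 10.25 inches.
w, h = scan_pixels(6.625, 10.25, 1200)
print(w, h, round(raw_megabytes(w, h)))  # 7950 12300 293
```

That is close to 98 million pixels per cover, each covering about 21 micrometres of paper (25.4 mm / 1200), which fits the "microscope" impression. Most image models take inputs of a few hundred pixels per side, so a scan like this would have to be downsampled or cut into tiles first.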

Jginsberg79 · March 3, 2021

Also, if I sounded critical I apologize. This is a very cool project.

RhialtoTheMarvellous · March 3, 2021

3 minutes ago, Jginsberg79 said:

Also, if I sounded critical I apologize. This is a very cool project.

Not at all. I just wish I had something more interesting to share at this point than the number of scans that CGC has on their website.

VintageComics · March 3, 2021

3 hours ago, RhialtoTheMarvellous said:

It's interesting that you say that. At one point I decided to test out my new scanner and took a 1200dpi scan of one of my books. The image ended up being huge of course. I opened it up and started zooming in on various areas of the book looking closely for defects. The image was so detailed that I found myself zooming in a lot and what I found is that zooming in on a 1200dpi image is like using a microscope on it. There are so many scratches and marks that are completely invisible to the naked eye that you can detect at that resolution.

This is partly why I wanted to try just using 2D scans at first, because it occurred to me that even at a lower scanning resolution there are probably patterns a computer can detect that a human can't detect.

Agreed. Most people don't realize this, but there are many defects that make the book look horrific if you focus on them, yet at arm's length, with normal vision, they wouldn't move the needle.

This is one of the concerns with grading.

Some people have better than 20/20 vision, and I think there is an eagle-eyed grader or two at CGC who may push the grading toward being too tight at times.

There are definitely some defects that just need to be ignored because at arm's length, to the human eye, they wouldn't even register.

And this might be where software would be of benefit. You would get consistency, but you'd need to set a baseline where the program sees the book the way a person would: as a comic, at arm's length, with 20/20 vision.

Also, scratches in gloss, while invisible to the naked eye at arm's length, would get factored into the grade on ultra high grade books.

Lots of factors to consider.
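The arm's-length baseline can be put into rough numbers. This sketch uses the standard definition of 20/20 vision as resolving about one arcminute, and it assumes an arm's length of 25 inches. Both helper names are hypothetical, and real perception depends on contrast and lighting as well.

```python
import math


def visible_dpi(distance_in: float, acuity_arcmin: float = 1.0) -> float:
    """Dots per inch at which one dot subtends the given visual angle."""
    dot_size_in = distance_in * math.tan(math.radians(acuity_arcmin / 60.0))
    return 1.0 / dot_size_in


def downsample_factor(scan_dpi: float, distance_in: float = 25.0) -> float:
    """How much a scan could be shrunk to approximate the arm's-length view."""
    return scan_dpi / visible_dpi(distance_in)
```

Under those assumptions `visible_dpi(25.0)` comes out between 137 and 138 dpi, so a 1200dpi scan carries roughly 8 to 9 times more linear detail than the eye resolves at that distance. One way to set the baseline would be to downsample training scans by about that factor, and keep the full resolution only for the ultra high grade cases where gloss scratches do count.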

bronze_rules · March 4, 2021

17 hours ago, allthingskryptonite said:

I would train a simple feed-forward neural network on a binary feature vector (one-hot style, though strictly multi-hot since several conditions can be present at once), where each feature is a condition being present or absent (mold, foxing, stains, staple rust, staple pop, centerfold detached, water damage, subscription crease, residue), which we would extract from grader notes (text data).

I like this idea a lot for an experiment. Except that you'd be relying on grader notes. I was originally thinking along the original poster's lines of just automating image features (e.g. jpg) through a deep NN. But this would make it a lot easier to validate different sets, find feature importance, etc. Someone could come up with a nice learner response by manually filling out the questionnaire and feeding it to the learned network. XGBoost would be nice here.

Still a dependency on grader notes here. But if someone was ambitious enough, you could go book by book (any book not just one unique book) jot down the features and grade outcome, and create your own dataset to train the learner. A lot of work, but plenty of books to visibly find on internet sources (heritage, etc) and enter data manually.
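A hand-built dataset like that could be as simple as a CSV file with one row per book. The sketch below shows only the bookkeeping, not XGBoost or any other learner. The column names and the two sample rows are made up for illustration.

```python
import csv
import io

FIELDS = ["title", "issue", "spine_ticks", "longest_crease_in",
          "staple_rust", "stains", "grade"]


def write_rows(rows: list[dict], handle) -> None:
    """Write hand-entered questionnaire answers as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_rows(handle) -> tuple[list[list[float]], list[float]]:
    """Read the CSV back as (feature rows, grade targets) for a learner."""
    features, targets = [], []
    for row in csv.DictReader(handle):
        features.append([float(row[name]) for name in FIELDS[2:-1]])
        targets.append(float(row["grade"]))
    return features, targets


# Two made-up entries, only to show the shape of the data.
buffer = io.StringIO()
write_rows([
    {"title": "ASM", "issue": 300, "spine_ticks": 3, "longest_crease_in": 0.0,
     "staple_rust": 0, "stains": 0, "grade": 9.0},
    {"title": "ASM", "issue": 300, "spine_ticks": 8, "longest_crease_in": 2.5,
     "staple_rust": 1, "stains": 1, "grade": 5.5},
], buffer)
buffer.seek(0)
X, y = read_rows(buffer)
```

`X` and `y` are plain lists of numbers, which is the shape most tabular learners accept. Keeping title and issue in the file but out of `X` makes it possible to mix different books in one dataset, as suggested.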

allthingskryptonite · March 4, 2021

36 minutes ago, bronze_rules said:

I like this idea a lot for an experiment. Except that you'd be relying on grader notes. I was originally thinking along the original poster's lines of just automating image features (e.g. jpg) through a deep NN. But this would make it a lot easier to validate different sets, find feature importance, etc. Someone could come up with a nice learner response by manually filling out the questionnaire and feeding it to the learned network. XGBoost would be nice here.

Still a dependency on grader notes here. But if someone was ambitious enough, you could go book by book (any book not just one unique book) jot down the features and grade outcome, and create your own dataset to train the learner. A lot of work, but plenty of books to visibly find on internet sources (heritage, etc) and enter data manually.

Yes, to perform the first part one would need grader notes as text, and then extraction of the relevant features. Such information might be hard to come by. One avenue that I can think of is crawling the mycomicshop website for raw books and their grader notes. This exercise is just to verify whether, given a set of simple features, we can make okayish predictions for grades. I believe the answer to this would be yes.

On second thought, the problem the OP wants to attempt -- predicting grades from purely image scans -- appears extremely challenging. Imagine a book that appears F/VF but has centerfold missing, it is a 0.5. If we only have image scans and no additional info regarding the missing centerfold then getting the correct prediction (0.5) is almost impossible.

bronze_rules · March 4, 2021

2 hours ago, allthingskryptonite said:

Yes, to perform the first part one would need grader notes as text, and then extraction of the relevant features. Such information might be hard to come by. One avenue that I can think of is crawling the mycomicshop website for raw books and their grader notes. This exercise is just to verify whether, given a set of simple features, we can make okayish predictions for grades. I believe the answer to this would be yes.

On second thought, the problem the OP wants to attempt -- predicting grades from purely image scans -- appears extremely challenging. Imagine a book that appears F/VF but has centerfold missing, it is a 0.5. If we only have image scans and no additional info regarding the missing centerfold then getting the correct prediction (0.5) is almost impossible.

Right. But I'm thinking of the centerfold missing, being more of an outlier. So, you read in the raw image, create a flat vector of the 2D image and train on grade target. Each image has to be perfectly aligned or have an algo to force it, along with presumably same consistent method of capturing image (this is another reason why I think cgc db was so good). Once the image has been scanned and grader has outcome, they do a preliminary check of other features, centerfold present? y. page missing n. Ok. grade verified. But, again I would think these cases would be outliers. The idea is to get volume pre-scans for the majority to reduce workload, provide objective rules, and speed up tasks. We could also add features on top of the image vectors. e.g. F1) number ticks 3 F2) longest crease etc. But these are adding more work to the data preparer. I truly think the raw image is enough to get close to grader results on average with enough data and consistent scan methodology. Any deviation in image placement could kill the learner (spine miswrap for example) and has to be accounted for.
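The "flat vector plus extra features" input could be sketched like this. It is only an illustration: `flatten` and `build_input` are hypothetical helpers, the image is a tiny grayscale stand-in given as nested lists, and the shape check is a crude placeholder for the alignment step, which would be a real problem of its own.

```python
def flatten(image: list[list[int]]) -> list[float]:
    """Row-major flat vector of a grayscale image, scaled to 0..1."""
    return [pixel / 255.0 for row in image for pixel in row]


def build_input(image, extra_features, expected_shape):
    """Flat pixel vector followed by the hand-entered features.

    Raises ValueError if the image shape differs from the expected one, since
    a flat vector only makes sense when every scan has the same layout.
    """
    shape = (len(image), len(image[0]))
    if shape != expected_shape or any(len(row) != shape[1] for row in image):
        raise ValueError(f"expected {expected_shape}, got {shape}")
    return flatten(image) + [float(value) for value in extra_features]


# Tiny 2x3 stand-in for a scan, plus F1 (number of ticks) and F2 (longest crease).
vector = build_input([[0, 255, 51], [255, 0, 0]], [3, 1.5], expected_shape=(2, 3))
```

Here `vector` has eight entries: six pixel values followed by the two hand-entered features. Because position in the vector is all the learner sees, a scan shifted by even a few pixels puts every defect in a different slot, which is why consistent capture and alignment matter so much for this approach.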

BTW, didn't know mycomicshop had grader notes, but this is a great data source if they do.

Edited March 4, 2021 by bronze_rules

allthingskryptonite · March 4, 2021

1 hour ago, bronze_rules said:

Right. But I'm thinking of the centerfold missing, being more of an outlier. So, you read in the raw image, create a flat vector of the 2D image and train on grade target. Each image has to be perfectly aligned or have an algo to force it, along with presumably same consistent method of capturing image (this is another reason why I think cgc db was so good). Once the image has been scanned and grader has outcome, they do a preliminary check of other features, centerfold present? y. page missing n. Ok. grade verified. But, again I would think these cases would be outliers. The idea is to get volume pre-scans for the majority to reduce workload, provide objective rules, and speed up tasks. We could also add features on top of the image vectors. e.g. F1) number ticks 3 F2) longest crease etc. But these are adding more work to the data preparer. I truly think the raw image is enough to get close to grader results on average with enough data and consistent scan methodology. Any deviation in image placement could kill the learner (spine miswrap for example) and has to be accounted for.

BTW, didn't know mycomicshop had grader notes, but this is a great data source if they do.

What you just described is very much doable. We need good quality scans -- and a lot of them. Grading companies like CGC or auction houses will have a large corpus of images on which someone can train a network.

My point with the centerfold missing example was to highlight that we still have to rely on human intervention: a grader who inspects the book for defects and notes them.

Regarding my comments on mycomicshop -- it publishes major defects only for relatively lower grade raw books. It does not provide general grader notes for all books.


74 posts in this topic
