Our Own MTG Database

Sep 11, 2025

It’s Been a While

I know it’s been some time since the last development update — but don’t worry, work is very much ongoing.

Over the past couple of months, we’ve been focusing on three closely related tasks: building our own MTG card database, collecting card photos, and developing the recognition module.

Because of their interdependencies, we had to keep switching back and forth between them.

As a result, none of these tasks felt “finished enough” to talk about… until now.

 

Data and Images

To train the recognition module, we need photo samples of every card.

And to implement sorting, we also need properties for each card — things like mana cost, color, mechanics, and so on.

Fortunately, we didn’t have to start from scratch. There’s an excellent resource — scryfall.com — which provides an API and generously allows developers to use MTG card data and images for applications or devices, without special restrictions, subscriptions, or royalties. For that, they truly deserve a big thank-you!

A small script was enough to quickly fetch all the data we needed. It’s a one-time job — from then on, we can work with our own copy, only updating it occasionally (maybe once or twice a month) when new sets are released.
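For readers curious what such a fetch script might involve, here is a minimal sketch. It is not our actual script. It assumes Scryfall’s public bulk-data index at `https://api.scryfall.com/bulk-data`, which lists downloadable exports with a `type` and a `download_uri`. The helper names, the User-Agent string, and the sample URIs are made up for illustration.

```python
import json
import urllib.request
from typing import Any

# Scryfall's public index of bulk-data exports.
BULK_DATA_INDEX_URL = "https://api.scryfall.com/bulk-data"


def fetch_index(url: str = BULK_DATA_INDEX_URL) -> dict[str, Any]:
    """Download the bulk-data index (network call, not run in this sketch)."""
    request = urllib.request.Request(
        url,
        headers={
            # Hypothetical client name; use one that identifies your own tool.
            "User-Agent": "mtg-db-import-sketch/0.1",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def pick_bulk_download_uri(index: dict[str, Any], bulk_type: str = "default_cards") -> str:
    """Return the download URI of the export with the requested type."""
    for entry in index.get("data", []):
        if entry.get("type") == bulk_type:
            return entry["download_uri"]
    raise KeyError(f"no bulk-data export of type {bulk_type!r}")


# Trimmed-down stand-in for the index response; the URIs are placeholders.
SAMPLE_INDEX = {
    "object": "list",
    "data": [
        {"type": "oracle_cards", "download_uri": "https://example.invalid/oracle-cards.json"},
        {"type": "default_cards", "download_uri": "https://example.invalid/default-cards.json"},
    ],
}

print(pick_bulk_download_uri(SAMPLE_INDEX))
```

Downloading one bulk file and working from a local copy keeps the load on Scryfall low, which fits the "fetch once, refresh occasionally" approach described above.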

Still, that alone didn’t feel like enough progress to share in a blog post.

 

Filtering What Really Matters

From the downloaded dataset, we had to extract only the parts truly needed for recognition and sorting — and that turned out to be trickier than expected.

The recognition module wasn’t finished yet, and new tests kept revealing issues. Solving those directly affected the data format, the choice of database engine, and a number of other technical decisions.

Only after many experiments — and fixing the major problems — did the recognition module become solid enough. It’s now in the final stages of large-scale testing (but that’s a story for another post).

Just a couple of weeks ago, we were finally able to settle on the technical foundations and complete the import module.

Here’s what our own MTG database looks like today:

  • Cards: 100,542 records
  • Sorting traits: 518,115 entries
  • Card images: 103,987 files (16.8 GB)
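To make the three numbers above more concrete, here is one way such a database could be laid out. This is only a sketch: it uses SQLite for convenience, and the table names, column names, and demo rows are all hypothetical and do not describe our actual engine or schema. The idea is one row per card, several trait rows per card (the totals above work out to roughly five per card), and one image row per printed face.

```python
import sqlite3

# Hypothetical layout: cards, their sorting traits, and one image per face.
SCHEMA = """
CREATE TABLE cards (
    card_id TEXT PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE traits (
    card_id TEXT NOT NULL REFERENCES cards(card_id),
    kind    TEXT NOT NULL,   -- e.g. 'color', 'mana_cost', 'mechanic'
    value   TEXT NOT NULL
);
CREATE TABLE images (
    card_id         TEXT NOT NULL REFERENCES cards(card_id),
    face_index      INTEGER NOT NULL,   -- 0 = front, 1 = back
    illustration_id TEXT,
    path            TEXT NOT NULL,
    PRIMARY KEY (card_id, face_index)
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a database with the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def table_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Count the rows in each of the three tables."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in ("cards", "traits", "images")
    }


# Demo data: one made-up double-faced card with three traits and two images.
conn = open_db()
conn.execute("INSERT INTO cards VALUES (?, ?)", ("demo-1", "Demo Werewolf"))
conn.executemany(
    "INSERT INTO traits VALUES (?, ?, ?)",
    [("demo-1", "color", "G"), ("demo-1", "mana_cost", "{1}{G}"), ("demo-1", "mechanic", "Transform")],
)
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?, ?)",
    [("demo-1", 0, "art-front", "img/demo-1_0.jpg"), ("demo-1", 1, "art-back", "img/demo-1_1.jpg")],
)
print(table_counts(conn))
```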

 

Data Analysis

You may have already noticed that there are 3,445 more image files than card records. ;) The reason is simple: some cards have artwork on both sides, and since either side might end up under the camera, both need to be recognized. That part isn’t hard to implement.
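As an illustration of the two-sided case, the sketch below collects one image URL per printed face from a Scryfall-style card object. It relies on the documented card layout, where double-faced cards carry `image_uris` inside each entry of `card_faces`, while single-faced cards (and split cards printed on one side) carry it at the top level. The function name and sample cards are made up for this example.

```python
from typing import Any


def face_image_urls(card: dict[str, Any], size: str = "normal") -> list[str]:
    """Return one image URL per physical side of the card."""
    faces = card.get("card_faces") or []
    # Double-faced cards: each face has its own image_uris.
    per_face = [face["image_uris"][size] for face in faces if "image_uris" in face]
    if per_face:
        return per_face
    # Single-sided cards, including split cards whose faces share one image.
    if "image_uris" in card:
        return [card["image_uris"][size]]
    return []


# Made-up sample cards with placeholder URLs.
single = {"name": "Demo Bear", "image_uris": {"normal": "https://example.invalid/bear.jpg"}}
double = {
    "name": "Demo Day // Demo Night",
    "card_faces": [
        {"name": "Demo Day", "image_uris": {"normal": "https://example.invalid/day.jpg"}},
        {"name": "Demo Night", "image_uris": {"normal": "https://example.invalid/night.jpg"}},
    ],
}
split = {
    "name": "Demo Fire // Demo Ice",
    "image_uris": {"normal": "https://example.invalid/fire-ice.jpg"},
    "card_faces": [{"name": "Demo Fire"}, {"name": "Demo Ice"}],
}

for card in (single, double, split):
    print(card["name"], len(face_image_urls(card)))
```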

What’s more concerning is that the same artwork can sometimes be reused across multiple cards. A quick query to our database shows that, in extreme cases, one piece of art appears up to 41 times.
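A query of that kind could look roughly like the sketch below. It assumes a hypothetical SQLite table named `images` with one row per card face and an `illustration_id` column (Scryfall exposes an identifier of that name for artwork). The table layout and the demo rows are invented for illustration and are not our real data.

```python
import sqlite3


def most_reused_art(conn: sqlite3.Connection, limit: int = 5) -> list[tuple[str, int]]:
    """List illustration IDs that appear on more than one image, most used first."""
    return conn.execute(
        """
        SELECT illustration_id, COUNT(*) AS uses
        FROM images
        WHERE illustration_id IS NOT NULL
        GROUP BY illustration_id
        HAVING COUNT(*) > 1
        ORDER BY uses DESC, illustration_id
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


# Demo table with made-up rows: 'art-a' is used three times, 'art-b' twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (card_id TEXT, face_index INTEGER, illustration_id TEXT)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [
        ("c1", 0, "art-a"),
        ("c2", 0, "art-a"),
        ("c3", 0, "art-a"),
        ("c4", 0, "art-b"),
        ("c5", 0, "art-b"),
        ("c6", 0, "art-c"),
        ("c7", 0, None),
    ],
)
print(most_reused_art(conn))
```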

Some of those cards have noticeable design differences and can be distinguished reliably, but others look almost identical — with only a few pixels setting them apart.

For sorting by name, mana, color, mechanics, and similar traits, this won’t be a major issue, since cards sharing the same art usually share the same properties. But when sorting by price, those small differences might matter.

This is something we’ll tackle in the recognition module later.

As for the MTG data import, though — that task is now complete.

 

What’s Next

The recognition module is almost ready — we’re running the final tests and processing all available card photos through it.

Once it’s complete, expect another blog post with details and results.

As for nearly identical cards that reuse the same artwork, we don’t have a clear solution yet — but we do have a few ideas to explore.

 

To be continued...

 


 
