Posted by: Brian Hibbs on June 29, 2007
Well! I *finally* got the pre-populated database for MOBY. It took nearly a month because I had some fussy requests for them, and I wanted a go at the data without having to use MOBY itself to access it (MOBY is nice as a POS program, but kind of mediocre as a let-me-edit-100k+-items tool; I’m using Excel to manage the data). So this is now my life for the next 2-4 weeks.
The MOBY database goes back something like 5 years, and has something on the order of 140k items in it. Now, of course, AT LEAST 70k of those items aren’t things I’d EVER stock in my store of my own volition (like, say, games, or cards, or XENA t-shirts), but it’s nice to have them in the database on the off chance that anyone might want them one day.
Of course, 140k items? Yeah, that’s a LOT.
So, step #1 was to hack that into more manageable chunks. I started by sorting out the database by Diamond “category” codes: code “1” is comics, “2” is magazines, “3” is books, and so on. 1, 2, and 3 each have their own files right now, so I don’t “cross the streams”, while the final 12 or so categories I split into two files. For the most part, I probably will barely touch the final 12 categories (since we’re, pretty much, a PURE comics shop), but I’m still going to have to at least look through each and every item to make sure.
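For anyone who would rather script this kind of split than do it by hand in Excel, here is a minimal sketch. The `category` column name, the chunk names, and the cutoff between the two catch-all chunks are all assumptions for illustration, not anything MOBY or Diamond defines, and the sample rows are made up.

```python
from collections import defaultdict

# Hypothetical chunk names for the three categories that get their own file.
OWN_FILE = {"1": "comics", "2": "magazines", "3": "books"}

def chunk_name(category, cutoff=9):
    """Return the chunk a row belongs in, given its category code.

    The cutoff between the two catch-all chunks is an arbitrary assumption.
    """
    if category in OWN_FILE:
        return OWN_FILE[category]
    return "other_a" if int(category) <= cutoff else "other_b"

def split_by_category(rows, field="category"):
    """Group row dicts into lists keyed by chunk name."""
    chunks = defaultdict(list)
    for row in rows:
        chunks[chunk_name(row[field].strip())].append(row)
    return dict(chunks)

# Made-up sample rows, just to show the shape of the data.
rows = [
    {"category": "1", "title": "SAMPLE COMIC"},
    {"category": "3", "title": "SAMPLE BOOK"},
    {"category": "7", "title": "SAMPLE T-SHIRT"},
    {"category": "12", "title": "SAMPLE CARD GAME"},
]
chunks = split_by_category(rows)
```

Each list in `chunks` could then be written to its own file with `csv.DictWriter`.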
Step #2 was to loosely sort the “chunks”. Actually, that took all of five minutes per file, if that: sort the “comics” by publisher, then by title, then by issue #, so that, say, all of the DC books are together, are alphabetical within the “DC” listing, and are in proper alphanumeric order.
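The same three-level sort can be sketched in a few lines of Python. The one wrinkle is the issue number: sorted as plain text, “10” lands before “9”, so this sketch pulls out the leading digits and sorts on those. The `publisher`, `title`, and `issue` column names are assumptions, and the rows are made-up samples.

```python
import re

def issue_key(issue):
    """Sort key so that issue "10" lands after "9" instead of after "1"."""
    match = re.match(r"\d+", issue.strip())
    number = int(match.group()) if match else -1  # non-numeric issues sort first
    return (number, issue)

def sort_comics(rows):
    """Sort by publisher, then title, then issue number."""
    return sorted(
        rows,
        key=lambda r: (r["publisher"], r["title"], issue_key(r["issue"])),
    )

# Made-up sample rows.
rows = [
    {"publisher": "MARVEL", "title": "SAMPLE TITLE", "issue": "1"},
    {"publisher": "DC", "title": "WONDER WOMAN", "issue": "10"},
    {"publisher": "DC", "title": "WONDER WOMAN", "issue": "9"},
    {"publisher": "DC", "title": "ANOTHER TITLE", "issue": "2"},
]
sorted_rows = sort_comics(rows)
order = [(r["publisher"], r["title"], r["issue"]) for r in sorted_rows]
```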
This, of course, assumes that the data is both accurate and complete, and of course it isn’t, precisely. Probably 98% looks pretty perfect, but there are definitely holes and problems and miscategorizations. As an example, it appears that every book with an “APR07” code either lost, or never was given, its “publisher” field, which means that sorting through the data is a two-step process: step a: look at the publisher’s bit; step b: look at the “no publisher listed” bit. Oh well, shit happens.
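That two-step pass amounts to splitting the rows into “has a publisher” and “no publisher listed” piles. A rough sketch, again assuming a `publisher` column name and using made-up rows:

```python
def partition_by_publisher(rows, field="publisher"):
    """Split rows into (publisher listed, no publisher listed)."""
    listed, unlisted = [], []
    for row in rows:
        value = (row.get(field) or "").strip()  # treat missing/blank alike
        (listed if value else unlisted).append(row)
    return listed, unlisted

# Made-up sample rows; the second and third have no usable publisher.
rows = [
    {"publisher": "DC", "title": "WONDER WOMAN"},
    {"publisher": "", "title": "SAMPLE TITLE A"},
    {"publisher": "   ", "title": "SAMPLE TITLE B"},
]
listed, unlisted = partition_by_publisher(rows)
```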
Step #3 is where I am right now: going through those files and looking for things that we do or will have in stock. For example, the “comics” file has 36,298 items in it. I’m going to get that down to probably under 2k items, really, since “Back issues” aren’t going to be individually tracked through the POS.
So, today I started going through the store with two goals. Goal a) was to strip some of the “sludge” from the racks: stuff that’s been sitting there for (likely) a year or more, almost all of it “indy” or “alternative” books. Generally, a book comes OFF the rack when the “next issue” is there to replace it, but much of the “indy” stuff never GETS a “next issue”, so it tends to accumulate longer. This goal was “pretty much” done (for COMICS) in the 4 hours I was at CE today. I ended up with a shortbox of “unsalable crap” (estimate: $400, my cost).
Goal b) was to set up a new column in the database that’s basically either a “1” or a “0”. “1” = “do inventory on this”. I almost finished the comics on the right side of the store. I’m going to go in ’round 9am tomorrow to take advantage of 90 minutes or so of no-customers-underfoot, but I’m hoping by, dunno, 3 PM or so I’ll be basically finished with the “do we have this in stock/do we anticipate it being in stock soon” marking up of the comics file.
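As a sketch of what that flag column does: every row gets a “1” or a “0”, and the “keep” list is just the rows marked “1”. The `item_code` and `do_inventory` column names are hypothetical, and the codes below are invented.

```python
def mark_for_inventory(rows, stocked_codes, key_field="item_code",
                       flag_field="do_inventory"):
    """Set the flag to "1" for items we stock (or expect to), else "0"."""
    for row in rows:
        row[flag_field] = "1" if row[key_field] in stocked_codes else "0"
    return rows

def keep_list(rows, flag_field="do_inventory"):
    """Return only the rows flagged for physical inventory."""
    return [row for row in rows if row.get(flag_field) == "1"]

# Invented item codes, standing in for whatever gets noted at the racks.
rows = [
    {"item_code": "A001", "title": "WONDER WOMAN"},
    {"item_code": "A002", "title": "SAMPLE BACK ISSUE"},
    {"item_code": "A003", "title": "SAMPLE INDY BOOK"},
]
mark_for_inventory(rows, stocked_codes={"A001", "A003"})
kept = keep_list(rows)
```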
Once that’s done, we move to Step #4, which is going through and editing what needs to be edited in the portion of the comics list I’m “keeping” (we’ll keep the data on the OTHER 34k-ish items, but basically trust that the “MOBY Defaults” are all correct). There I’ll be setting reorder points, primary/secondary/tertiary distributor, genre and so on.
There are, by the way, 52 (!!!) columns in each row of the database that I could edit if I wanted/needed to. Though I doubt I’ll edit more than 5-10 of them for most items.
There’s also Step #5: Look for things that I DON’T have on my rack, but that I WANT to have, and that have “slipped through the cracks” over the years. That’s a day’s work, I think (I’m doing that semi-concurrently, and have finished all of the publishers that begin with an “A”).
Then, after that, comes Step #6: printing a list, organized by rack (genre), and doing a hard physical inventory; followed by Step #7: entering that data into the computer; followed by Step #8: POS goes live. (#6 & 7 will happen “concurrently”, hopefully in one single day. #8 should be the day after.)
I have to do many of those same steps for the “magazine” and “book” files (and far fewer for the two “everything else” files).
Mags will be pretty simple, actually (again, we don’t carry most of that), and I think Books will go smoother as well because I won’t have to keep running to the rack to say “Wait, what issue is this on?”. Books are largely binary in “do I carry this or not”, while periodicals are “I carry this, but not issues six months back” (or whatever), and I have to constantly keep checking, for instance, “what issue is WONDER WOMAN on?”
Based on my progress today, I’m going to tentatively believe that I can accomplish all of this over 3 weeks or less, and my HOPE is I’ll be able to “go live” with MOBY on or about 8/1. (I actually *think* I can get it done in around 10 days, and *could* be “live” by 7/15, but I’m still going to settle for 8/1 or later, because “pad” is always better)
So that’s what I’m doing, work-wise, for the 11 of you that care. Lots of pain now in exchange for some powerful tools for the future.
I’ll try to get in 1 or 2 reviews this week, but don’t count on it, given the above.