1974ER wrote: But... wouldn't it have to check absolutely every single character's GUID against the database anyway? Meaning... open file, check all GUIDs against database, then go through all that need a check for level up / guild change, execute and save changes, close file, move on to next file. In other words, having fewer overall characters to check would be better, right?
Theoretically, yes... but for small files, the copying and switching folders would take longer than the upload itself... and for big files, it would increase the time needed as well, because one can't upload the file WHILE it's being copied either.
And the problem with the database is more related to its size than the uploads... you might not have realized this yet... but according to the DB stats info... it currently contains almost 94 MILLION different characters. That's a HUGE pile of data to sift through, no matter how well indexed it is.
EDIT: Typo + congratulations on your first new ones and updates!

No, it would only need to check the list of census GUIDs until it finds one that is already in the database, and then it knows which characters to ignore.
It's like this... (census number, guid, date, server, character who took the census)
Census=1, GUID-48451487498454874987, Nov 30 2010, 3:00 PM, Firetree, Pencey
Census=2, GUID-91564984584894748496, Dec 01 2010, 6:00 PM, Firetree, Pencey
Census=3, GUID-16489451458979874874, Dec 02 2010, 9:00 PM, Firetree, Pencey
(server, faction, name, level, census number)
Firetree, Alliance, RarelyPlaysDude, level 30, Census=1
Firetree, Alliance, Pencey, level 54, Census=2
Firetree, Alliance, 1974ER, level 77, Census=3
... and way more characters with varying Census=# between 1-3.
The higher the Census=#, the more recent the update.
Now imagine the first time I upload this file is after the 2nd census (so the update of 1974ER is not there). It will add both RarelyPlaysDude and Pencey to the database and update their level. And then at the end, it'll record that GUID for the 2nd census in the database.
Now, when I upload the file after taking the 3rd census (Pencey and RarelyPlaysDude were not seen in this example, so they don't need to be updated), the script looks up the Census=3 GUID in the database. It's not there, so it knows that any character with Census=3 should be updated.
It then looks up the Census=2 GUID. It finds it. Therefore it knows to ignore all chars with Census=2 or even Census=1.
There's some logic that needs to be thought through for how to get this to work for multiple characters and servers, but that's the basic idea.
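Here's a rough sketch of that idea in Python, just to make the steps concrete. Everything in it is an assumption for illustration: the names (Census, Character, characters_to_upload) are made up, and a plain set called known_guids stands in for the "is this census GUID already in the database?" lookup. It is not how the actual upload script works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Census:
    number: int  # higher number = more recent census
    guid: str


@dataclass(frozen=True)
class Character:
    server: str
    faction: str
    name: str
    level: int
    census: int  # number of the census that last saw this character


def characters_to_upload(censuses, characters, known_guids):
    """Return only characters seen in censuses the database has not recorded."""
    newest_known = 0  # 0 means nothing from this file was uploaded before
    # Walk from newest to oldest; stop at the first GUID the database has.
    for census in sorted(censuses, key=lambda c: c.number, reverse=True):
        if census.guid in known_guids:
            newest_known = census.number
            break
    # Anything tagged with that census or an older one can be ignored.
    return [ch for ch in characters if ch.census > newest_known]


censuses = [
    Census(1, "GUID-48451487498454874987"),
    Census(2, "GUID-91564984584894748496"),
    Census(3, "GUID-16489451458979874874"),
]
characters = [
    Character("Firetree", "Alliance", "RarelyPlaysDude", 30, 1),
    Character("Firetree", "Alliance", "Pencey", 54, 2),
    Character("Firetree", "Alliance", "1974ER", 77, 3),
]

# Hypothetical database state: census 2 was recorded by the earlier upload.
known_guids = {"GUID-91564984584894748496"}

pending = characters_to_upload(censuses, characters, known_guids)
print([ch.name for ch in pending])  # ['1974ER']
```

On a first-ever upload (known_guids empty) nothing matches, so every character is sent, which is the same as the current behaviour.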
10 MB takes mere seconds to copy. You're losing like 10 seconds here making a copy of a file (delete the previous copy, select the current file, Ctrl+C, Ctrl+V, wait all of 5-10 seconds for it to copy... you don't even need to change directories).
Yeah, the size of the database is a big problem. Doing so many queries (searches) on a large database is going to slow down processing. So we want to reduce the number of queries. I think my idea could reduce the number of queries by at least 50% (probably more like 90% when it comes to people who don't need to prune/purge).
Although, perhaps it's the updates that are the slowest part, in which case all we can do is submit less (non-redundant) data...
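To put rough numbers on the query-count estimate, here's a back-of-the-envelope sketch. The figures (10 censuses, 100 characters seen in each, only the newest census not yet uploaded) are made up for illustration, and query_counts is a hypothetical helper; it assumes one lookup per character uploaded plus one lookup per census GUID checked.

```python
def query_counts(char_census_numbers, census_numbers, newest_known):
    """Return (lookups without the census check, lookups with it)."""
    # Without the check: one lookup for every character in the file.
    naive = len(char_census_numbers)
    # With the check: one lookup per census GUID newer than the known one,
    # plus the lookup that finally finds the known census (if there is one).
    newer = [n for n in census_numbers if n > newest_known]
    guid_lookups = len(newer) + (1 if newest_known in census_numbers else 0)
    # Then only characters from the newer censuses still need a lookup.
    char_lookups = sum(1 for n in char_census_numbers if n > newest_known)
    return naive, guid_lookups + char_lookups


census_numbers = list(range(1, 11))  # censuses 1..10
char_census_numbers = [n for n in census_numbers for _ in range(100)]

naive, filtered = query_counts(char_census_numbers, census_numbers, newest_known=9)
print(naive, filtered)  # 1000 102
```

With those made-up numbers that's roughly a 90% drop in lookups. If the updates themselves are the slow part, the saving only matters to the extent that the skipped characters would also have triggered writes.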
