Another good news, bad news, other news day.
I found (again) the process that updates the data that feeds Census.php pages.
This data currently gets updated once a day, in the early AM hours site time.
After discussion with Metalbeast, I have gone from gentle nag mode to active fix mode.
I intend to cut the lag time for this data down from as much as 24 hours to, hopefully, 3 to 6 hours.
This final update process is time consuming and somewhat computer resource intensive.
I have three conflicting goals: Metalbeast's wants and needs concerning site performance and requirements, users' wants and needs for more and better data than we can provide, and my absolute limits of what I will and won't do to balance the other needs.
The process that updates the Census.php pages also cleans out data, but not in a way I find acceptable, nor one that Metalbeast found acceptable (i.e. currently it is broken, burning CPU cycles and doing no real work).
The changes that are going to be made will be gradual, but probably faster than the removal of the two very old level 20 characters that were taken out of the data today (after seven hours of research and testing).
Note: old data that has already been removed cannot be restored. It just cannot happen. Sorry about that.
All characters level 20 and below will be gone some day.
My current testing has a last seen cutoff date of the day before the Cataclysm release.
This will very slowly cycle forward up to the current expansion release date.
There are currently over 2 million characters at level 20; I expect many of them are starter accounts and will never go higher.
I do not want them in the database.
Even if most of my characters are in that group.
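For the curious, the level 20 and below cleanup described above could be sketched roughly like this. This is only an illustration, not the actual cleanup code: the field names (`level`, `last_seen`), the function names, and the choice to treat "last seen on or before the cutoff" as removable are all my assumptions, and the Cataclysm date used is the North American/European release (7 December 2010).

```python
from datetime import date, timedelta

# Assumption: characters at or below this level are candidates for removal.
MAX_LOW_LEVEL = 20

# Assumption: NA/EU Cataclysm release date; the test cutoff is the day before.
CATACLYSM_RELEASE = date(2010, 12, 7)
DEFAULT_CUTOFF = CATACLYSM_RELEASE - timedelta(days=1)


def should_remove(level, last_seen, cutoff=DEFAULT_CUTOFF):
    """Return True for a low level character last seen on or before the cutoff."""
    return level <= MAX_LOW_LEVEL and last_seen <= cutoff


def select_for_removal(characters, cutoff=DEFAULT_CUTOFF):
    """Pick the characters (hypothetical dicts) that the cleanup would remove."""
    return [
        c for c in characters
        if should_remove(c["level"], c["last_seen"], cutoff)
    ]
```

Cycling the cutoff forward would then just mean calling `select_for_removal` again with a later `cutoff` date, one small step at a time.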
Blizzard's decision to allow inactive character names to be grabbed and reused is one of my clean out points.
If someone on a realm grabs an inactive name and uses it on the same faction/race/class, it is hard to know that there is a new player behind the curtain.
Levels 21 through (yet to be determined) that have not been active since the cutoff Blizzard uses above will be removed based on a level-time curve (also yet to be determined).
When I get to it, the 293 million rows of history data will also be shrunk.
Again, it will be done based upon a level-time curve (nothing more than a vague idea so far).
At a minimum the existing x5/x0 records will be kept.
More active characters will have more current records kept.
Less active characters are more likely to see their lower non-x5/x0 records that still exist go away.
Same-level records where the character goes in guild, out of guild, then back into the same guild will have the out-of-guild records removed.
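To make the guild rule above a bit more concrete, here is a rough sketch of how it might work on one character's history. It is not the real process: the record layout (`level`, `guild`, with `None` meaning out of guild), the function names, and the idea that x5/x0 records are protected from this rule are all assumptions on my part. The level-time curve is left out entirely, since it does not exist yet.

```python
def is_milestone(level):
    """x5/x0 levels (5, 10, 15, ...) are the records kept at a minimum."""
    return level % 5 == 0


def drop_guild_gaps(records):
    """Remove out-of-guild records sandwiched between same-guild, same-level records.

    `records` is one character's history in time order. Each record is a
    hypothetical dict with "level" (int) and "guild" (str, or None when the
    character is out of guild).
    """
    kept = []
    i, n = 0, len(records)
    while i < n:
        rec = records[i]
        if rec["guild"] is None and kept and kept[-1]["guild"] is not None:
            # Find the end of this run of out-of-guild records.
            j = i
            while j < n and records[j]["guild"] is None:
                j += 1
            prev, run = kept[-1], records[i:j]
            rejoined_same_guild = (
                j < n
                and records[j]["guild"] == prev["guild"]
                and records[j]["level"] == prev["level"]
                and all(r["level"] == prev["level"] for r in run)
            )
            # Assumption: milestone (x5/x0) records are never dropped.
            if rejoined_same_guild and not is_milestone(prev["level"]):
                i = j  # skip the whole out-of-guild run
                continue
            kept.extend(run)
            i = j
            continue
        kept.append(rec)
        i += 1
    return kept
```

For example, a level 42 history of in "Alpha", out of guild, back in "Alpha" would lose the middle record, while leaving "Alpha" and later joining "Beta" would keep everything.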
And I have no clue what else will happen, but I hope it ends up being the least displeasing to all parties.