Most important stat missing

Suggestions for WarcraftRealms.com
BambisFaline

Post by BambisFaline »

1974ER wrote:
Quote: "The only way you know that a certain time produces a high population is...wait for it...multiple samples run at the same time."
Ouch... Wrong, wrong, wrong... at least on the EU realms. I have not examined the US realms as closely, but all EU servers I have examined more carefully show similar (mind you, similar, not identical) patterns. Populations are often at their lowest from around 02 to 04, at their highest around approximately 19 to 22.
I'm afraid in this case you are wrong, because you are making assumptions based on data that you reject...the multiple survey position.

The only possible way you would know that populations are "often at their lowest from around 02 to 04" is by multiple surveys.
And the only way to know that the highest population is not transient (peaking at 23.30, for example) is to run multiple samples covering DIFFERENT times, not just one.
Again, you are incorrect, though we may be using a different meaning for "transient".

To know that a population is stable (to move away from the word "transient") at a certain time is to see that population again and again at the same time, requiring multiple surveys. A survey done at 11:00 and again at 23:30 will tell you nothing about the population at either 11:00 or 23:30, as the data sample is too small to draw a conclusion from.
You see, I think... and I do apologize if I am utterly wrong, that given the following situation we would pick completely opposite approaches:

Server x has 100 Alliance entries, 10 Horde entries.

I assume you would go for Alliance, seeing that it has strong data, which would make it easy to increase the data quality. I, on the other hand, would probably go: "Horde has a miserable amount of data, I should fix that."
Here we do indeed disagree. But the problem is what you consider to be good data.

You equate high population to good data and low population to bad data.

Data is data.

I may be completely misunderstanding you, so I beg your forgiveness. But the position you seem to be taking amounts to saying that a census of Paris, France would be good data (because of high population) but a survey of Bucktooth, Florida would be bad data (because of low population).

Population is whatever the population is. The data is good or bad based on the quality of the data.

Now, we can change that a bit around. If you want to know the population of France, you can survey huge population centers and ignore the rural communities and you'll have a pretty good idea of the population of France. On the other hand, if you survey the rural communities and ignore the high population areas, you'll have a rather bad idea of the population of France.

In either case, though, only multiple surveys will tell you if your data is good or bad. Perhaps that high population area was a high population area due to some external or temporary circumstance? The only way you'd know is multiple surveys.

So again I go back to my point in the above post. Multiple surveys verify data. The data you want depends on what you are looking for, whether server makeup or population. But ultimately, multiple surveys are required for good data.
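The point about repeated surveys can be put in numbers. What follows is only a minimal sketch, not anything WarcraftRealms actually runs: the function name `summarize_by_hour` and the sample figures are made up for illustration. It groups survey counts by hour and reports the average and the spread, and an hour with a single survey gets no spread at all, which is exactly why one sample cannot verify itself.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_hour(samples):
    """Group (hour, characters_seen) pairs and summarize each hour.

    Returns {hour: (survey_count, average_seen, spread_or_None)}.
    """
    by_hour = defaultdict(list)
    for hour, seen in samples:
        by_hour[hour].append(seen)

    summary = {}
    for hour, counts in sorted(by_hour.items()):
        # One survey yields a count but no spread, so it cannot be verified.
        spread = stdev(counts) if len(counts) > 1 else None
        summary[hour] = (len(counts), mean(counts), spread)
    return summary


# Hypothetical numbers, for illustration only.
samples = [(20, 950), (20, 1010), (20, 980), (3, 40)]
summary = summarize_by_hour(samples)
for hour, (n, avg, spread) in summary.items():
    print(hour, n, avg, spread)
```

With these invented figures, the 20:00 slot has three surveys and a measurable spread, while the 03:00 slot has one survey and nothing to compare it against.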

Going further, if we were given 30 days and told we are only allowed one census per day, you would presumably pick the time which had the most old entries and fell within your playing time. You would then proceed to run the daily census on the same strike of the clock every day to the best of your ability. And this even if the lesser quality data suggested that the peak population might be several hours off of your chosen time.

I, on the other hand, would spread out a bit... most likely picking probable highest times for the first few days to feel things out and then vary a bit, maybe for example going 19, 20, 21, 19, 20, 21, etc...

At the end of the 30 day period... the likely result would be (assuming the server is not horribly unbalanced) that we have both run 30 censuses, I have seen more new characters, updated a few more, tied or lost just by a little, and you would have more "quality 30 characters" than I would.

The last line I agree with. :D And it seems I produced another "wall", sorry. :(

EDIT: Corrected a few minor typos. :)

1974ER
Epic Censi
Posts: 762
Joined: Fri Nov 07, 2008 3:30 am

Post by 1974ER »

Bambi's Faline wrote:I'll go further and say that the stat displayed such as "WR Updates: 28,212" (my current stat) is disingenuous as it tends to give the impression that someone who only has 1,000 WR updates has done less work than my 27,212, even though he may have run 1,000 surveys on an empty server (besides himself) while I ran only 27 surveys during a time of 1K population.

An even more interesting stat (here may be another suggestion) is a ratio stat. How many updates / number of surveys turned in. That, it would seem to me, would be a more accurate number as to the amount of work a person does for WR, as it breaks down into an average of how many toons someone is seeing, giving credit to both high pop and high survey runs.
You double posted, so I am going to reply to part one first and edit for part two later. I mostly agree with the parts I deleted. Now some comments: I agree with the disingenuous part, especially since someone can run a lot of censuses which produce little or even no new info, if the info was submitted "too late" and someone else has covered most/all characters the later submitting person saw.

As for the ratio, I already wrote about it in my post of the 8th, which you possibly didn't fully read. And I concur with your assessment, which is what I tried to say, you just managed to put it in words better. :D One minor reminder though, due to what I said in the previous paragraph, even that won't be fully correct, as different users have different submission habits and different realms receive highly different amounts of coverage. Some get 5+ censuses a day, others don't even quite manage once per month. :/

Finished with part one... at least for now and taking a break. Will continue a bit later. :D

EDIT: beginning with part two...
Bambi's Faline wrote:The only possible way you would know that populations are "often at their lowest from around 02 to 04" is by multiple surveys.
Actually, that fact is (partially) derived from the lack of data during those times, even on servers with confirmed high populations. And yes, I am aware that my thinking is "dangerous", but the fact remains that in some cases a lack of data is itself a form of data. Which is what is happening here.
Bambi's Faline wrote:Again, you are incorrect, though we may be using a different meaning for "transient".
I misunderstood your meaning, which led to a non-relevant conclusion. In other words, you were pointing out something other than what I responded to. :(
Bambi's Faline wrote:You equate high population to good data and low population to bad data.
Ummm... No, actually, no I don't. I equate well spread data to better data than concentrated data, especially if the total amount of available data is low. Using your example, in my opinion running a census in Paris (large population faction OR expected higher population) AND Bucktooth (small population faction OR expected lower population) on Monday, will give me a better picture of Earth's (realm's) population than running a census in Paris on Monday AND Tuesday.

Or even, running a census in Paris at 6 AM AND 8 PM on Monday, will give me a greater amount of knowledge about Paris than running a census at 3 PM on Monday and Tuesday.

But the main point I was trying to make was: If you want to have at least a bit of info from Earth's population and only get one snapshot, it is better to census Paris than Bucktooth. :D
Bambi's Faline wrote:Now, we can change that a bit around. If you want to know the population of France, you can survey huge population centers and ignore the rural communities and you'll have a pretty good idea of the population of France. On the other hand, if you survey the rural communities and ignore the high population areas, you'll have a rather bad idea of the population of France.
This isn't really a change, because you are now arguing my one snapshot case. Everything in your quote is true, but I was arguing the following: If we only get one snapshot, it is better to pick population centres (expected high time), but if we get two it's better to pick both rural areas (expected low) and population centres (expected high) on Monday, instead of examining population centres on Monday and Tuesday. This of course provided that we are more interested in France overall than getting higher quality data for population centres only. I prefer the former, you seem to prefer the latter.

The end of your post seems to be an unsuccessful quote from my post, which either was accidentally left in or you originally intended to write more than you did... But... it is way too late, so I am off to bed for now.

BambisFaline

Post by BambisFaline »

Well, I think we've run this discussion into the ground. I think we have found common ground in that we both agree that multiple, high population surveys produce the best possible data on the population makeup of a server. We can add even better data by obtaining multiple surveys of off hours, leading to a clear picture of who is on, and when.

To that end, a stat showing "total number of characters seen" or "total number of surveys taken" does not fully (or correctly, if I may put it forward) reflect the amount and/or quality of work someone put in. Rather, a ratio of characters seen/surveys taken would be a better, more accurate stat.

1974ER
Epic Censi
Posts: 762
Joined: Fri Nov 07, 2008 3:30 am

Post by 1974ER »

Bambi's Faline wrote:Well, I think we've run this discussion into the ground. I think we have found common ground in that we both agree that multiple, high population surveys produce the best possible data on the population makeup of a server. We can add even better data by obtaining multiple surveys of off hours, leading to a clear picture of who is on, and when.

To that end, a stat showing "total number of characters seen" or "total number of surveys taken" does not fully (or correctly, if I may put it forward) reflect the amount and/or quality of work someone put in. Rather, a ratio of characters seen/surveys taken would be a better, more accurate stat.
I am happy with the first paragraph's conclusions. :D

The second, however, requires some input from Rollie. Is it actually possible for you to see from our submissions how many characters we have seen? As that number is much higher than the number of updates + new ones. And if not, can we at least get the simpler:

(updated + new) / number of censuses run

ratio as discussed by me and Bambi's Faline?
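To make the request concrete, here is a rough sketch of that ratio. It assumes the site can count updated characters, new characters, and submitted censuses per user; the function name `updates_per_census` and the figures below are hypothetical round numbers, not real account stats.

```python
def updates_per_census(updated, new, censuses_run):
    """Return (updated + new) / censuses_run, or None when no censuses were run."""
    if censuses_run <= 0:
        # Avoid dividing by zero for users who have not submitted anything yet.
        return None
    return (updated + new) / censuses_run


# Hypothetical users, for illustration only.
empty_server_user = updates_per_census(updated=1000, new=0, censuses_run=1000)
peak_time_user = updates_per_census(updated=26000, new=1000, censuses_run=27)

print(empty_server_user)  # average characters credited per census
print(peak_time_user)
```

The raw totals make the second user look 27 times more productive, while the ratio shows how many characters each census actually contributed. Neither number alone tells the whole story, which is why showing both might be the fairest option.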

Rollie
Site Admin
Posts: 4783
Joined: Sun Nov 28, 2004 11:52 am
Location: Austin, TX

Post by Rollie »

That would be doable.
phpbb:phpinfo()
