Without a Coon Again?????

Almost all my statistical posts are based on an assumption that the relationship between prices (Price Index) and results (PPI) is pretty close to linear.

I feel pretty certain that if your prices are above 1.00 and your results are below 1.00, that is a poor result. Conversely, if your prices are below 1.00 and your results are above 1.00, that is a good result.

I also feel pretty certain that prices and results below 1.00 correlate well. I am not so certain of the correlation between prices and results above 1.00.

My assumption has been that if you have a Price Index of 2.00, your PPI (results) should be around 2.00 as well. And the corollaries to this are that if your results are below 2.00, you underperformed, and if your results are above 2.00, you overperformed.

I have not really proved that this relationship is that close to perfect. It would take a lot of mathematics (correlation and regression, feeding a large amount of data into a computer) to establish the exact degree of correlation and to figure out the equation that describes the relationship between the two factors more exactly.

For example, it could be that a Price Index of 2.00 should produce a result of 1.90 or thereabouts (instead of 2.00, as I have so far assumed).

----------------------------------------

The paragraphs above are from a previous post (a sidebar actually) called Without a Coon????? Just for grins I decided to explore this idea a bit further by examining the results of all sales foals of 2003-2007 by their price ranges. Be forewarned that the following post is going to be even more mathematical than usual.

Below is the distribution of foals by price ranges.

Price Range          Foals     Average       Maverage     Price Index
$100-9,999           27,364    $3,895        38.95        0.24
$10,000-19,999       11,023    $13,576       115.76       0.71
$20,000-29,999       7,031     $23,130       151.72       0.93
$30,000-49,999       7,181     $37,067       191.91       1.18
$50,000-99,999       8,579     $68,435       268.83       1.65
$100,000-199,999     5,359     $134,410      364.81       2.24
$200,000-499,999     3,280     $273,136      518.55       3.18
$500,000-999,999     664       $650,369      794.85       4.87
$1,000,000+          233       $2,087,339    1,105.62     6.78
Totals               70,714    $54,140       163.11       1.00

No surprises here. The main purpose of classifying the foals thusly was to see how the results from the individual price ranges stacked up against their prices. Results are listed below. APPPSW stands for average Performance Points per stakes winner and is a measure of the quality of stakes winners involved (618 being average).

Price Range          Foals     Stakes Winners    %        APPPSW    PPI (Result)
$100-9,999           27,364    389               1.42     468       0.32
$10,000-19,999       11,023    305               2.77     502       0.66
$20,000-29,999       7,031     231               3.29     568       0.89
$30,000-49,999       7,181     345               4.80     566       1.29
$50,000-99,999       8,579     486               5.66     714       1.92
$100,000-199,999     5,359     324               6.05     734       2.11
$200,000-499,999     3,280     247               7.53     742       2.66
$500,000-999,999     664       61                9.19     654       2.86
$1,000,000+          233       27                11.59    839       4.60
Totals               70,714    2,415             3.42     618       1.00

No surprises here either. As the price ranges increase, the percentage of stakes winners from foals also increases from 1.42% for the lowest group to 11.59% for the highest group. The average Performance Points per stakes winner also increases from 468 for the lowest group to 839 for the highest group (with some glitches along the way). Most importantly, the PPI (result) increases from 0.32 for the lowest group to 4.60 for the highest group. All of this merely confirms that the markets are more or less rational, at least in the macro sense (over a large body of data).
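As a quick check on the percentages quoted above, here is a minimal Python sketch that recomputes the stakes-winner percentage for each price range from the foal and stakes-winner counts in the table. The variable and function names are illustrative only; nothing here comes from the original data files.

```python
# Foal and stakes-winner counts copied from the table above.
ranges = [
    ("$100-9,999", 27364, 389),
    ("$10,000-19,999", 11023, 305),
    ("$20,000-29,999", 7031, 231),
    ("$30,000-49,999", 7181, 345),
    ("$50,000-99,999", 8579, 486),
    ("$100,000-199,999", 5359, 324),
    ("$200,000-499,999", 3280, 247),
    ("$500,000-999,999", 664, 61),
    ("$1,000,000+", 233, 27),
]


def sw_percent(foals, winners):
    """Stakes winners as a percentage of foals (hypothetical helper)."""
    return 100.0 * winners / foals


percents = {label: sw_percent(foals, winners) for label, foals, winners in ranges}

# Totals across all nine price ranges.
total_foals = sum(foals for _, foals, _ in ranges)
total_winners = sum(winners for _, _, winners in ranges)
overall = sw_percent(total_foals, total_winners)

for label, pct in percents.items():
    print(f"{label:<18} {pct:6.2f}%")
print(f"{'Totals':<18} {overall:6.2f}%")
```

The printed percentages should agree with the % column to within rounding, and they rise steadily from the lowest price range to the highest.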

Now let us compare prices with results.

Price Range          Foals     Price Index    PPI (Result)
$100-9,999           27,364    0.24           0.32
$10,000-19,999       11,023    0.71           0.66
$20,000-29,999       7,031     0.93           0.89
$30,000-49,999       7,181     1.18           1.29
$50,000-99,999       8,579     1.65           1.92
$100,000-199,999     5,359     2.24           2.11
$200,000-499,999     3,280     3.18           2.66
$500,000-999,999     664       4.87           2.86
$1,000,000+          233       6.78           4.60
Totals               70,714    1.00           1.00

The first thing I notice is that the lowest three groups (up to $29,999) are all below 1.00 for both prices and results. The highest six groups ($30,000+) are all above 1.00 for both prices and results. So $30,000 is the price at which you start to receive above-average (1.00+) results for your money.

The most interesting thing is how prices stack up versus results. The lowest three groups are not too far off (0.24 to 0.32, 0.71 to 0.66, and 0.93 to 0.89). The middle three groups are a bit more erratic (1.18 to 1.29, 1.65 to 1.92, and 2.24 to 2.11). The highest three groups start to show some significant separation between prices and results, with the former higher than the latter (3.18 to 2.66, 4.87 to 2.86, and 6.78 to 4.60).

At this point I got out my old college stats book and refreshed my memory on correlation and regression. I suspect that in order to do a proper statistical analysis of these data you would need to treat them as 70,714 individual pieces (one for each foal). That is obviously beyond my available computing resources.

Just for grins, though, I decided to treat them as only nine individual pieces of data (one for each price range involved). X equals price, Y equals result. I lined up the nine pairs of numbers (as shown above) and crunched some numbers and came up with an equation (Y’ = 0.47 + 0.6X) to describe the relationship between variables X and Y. (I hope I calculated correctly!!!!). The results are shown below.

X (Price)    Y (Result)    Y’ (Predicted Result From Equation)
0.24         0.32          0.61
0.71         0.66          0.90
0.93         0.89          1.03
1.18         1.29          1.18
1.65         1.92          1.46
2.24         2.11          1.81
3.18         2.66          2.38
4.87         2.86          3.39
6.78         4.60          4.54
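For anyone who wants to check the arithmetic, here is a minimal Python sketch of a straight-line fit to the nine (X, Y) pairs. It assumes ordinary least squares, which is the usual textbook method but is not named explicitly in this post, and the function name fit_line is made up for illustration.

```python
# Nine (Price Index, PPI) pairs, one per price range, from the table above.
xs = [0.24, 0.71, 0.93, 1.18, 1.65, 2.24, 3.18, 4.87, 6.78]
ys = [0.32, 0.66, 0.89, 1.29, 1.92, 2.11, 2.66, 2.86, 4.60]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(xs, ys)
print(f"Y' = {intercept:.2f} + {slope:.2f}X")
```

On these nine points the fitted slope and intercept come out close to the 0.6 and 0.47 used in the equation above. Note that this treats each price range as one equally weighted point, regardless of how many foals it contains.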

I was basically interested to see if the regression equation did a better job of predicting results from prices than a simple assumption that prices equal results (X = Y). It did not in five of the nine price ranges examined. Only at the highest three ranges ($200,000+) did the equation do a better job of prediction than assuming that X equals Y.

At $30,000-49,999 both predicted 1.18, but the actual results were 1.29. So I have to call that a tie. In the other five ranges the assumption that X = Y did a better job of prediction than the regression equation.

X matched up with Y (assumption that X = Y) better than with Y’ (regression equation) in the lowest six ranges (except for $30,000-49,999, which was a tie, as mentioned above). At the highest three ranges (beginning with $200,000), however, X starts to overestimate Y pretty severely.
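The range-by-range comparison described above can be sketched in a few lines of Python. The yardstick here is the absolute error of each prediction against the actual result, which is my reading of "a better job of prediction" and should be treated as an assumption. The list names are illustrative.

```python
import math

labels = [
    "$100-9,999", "$10,000-19,999", "$20,000-29,999",
    "$30,000-49,999", "$50,000-99,999", "$100,000-199,999",
    "$200,000-499,999", "$500,000-999,999", "$1,000,000+",
]
x = [0.24, 0.71, 0.93, 1.18, 1.65, 2.24, 3.18, 4.87, 6.78]       # Price Index
y = [0.32, 0.66, 0.89, 1.29, 1.92, 2.11, 2.66, 2.86, 4.60]       # actual PPI
y_pred = [0.61, 0.90, 1.03, 1.18, 1.46, 1.81, 2.38, 3.39, 4.54]  # Y' from the table

verdicts = []
for label, xi, yi, pi in zip(labels, x, y, y_pred):
    err_xy = abs(xi - yi)    # error if we simply assume X = Y
    err_reg = abs(pi - yi)   # error of the regression prediction
    if math.isclose(err_xy, err_reg, abs_tol=1e-9):
        verdict = "tie"
    elif err_xy < err_reg:
        verdict = "X=Y"
    else:
        verdict = "regression"
    verdicts.append(verdict)
    print(f"{label:<18} X=Y error {err_xy:.2f}   regression error {err_reg:.2f}   {verdict}")
```

Run on the table values, this gives five ranges to X = Y, one tie, and the top three ranges to the regression equation.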

Therefore, I conclude that assuming X equals Y works just fine up to about $200,000, beyond which the regression equation does a better job of estimating Y from X. Only 4,177 of the 70,714 foals (about 5.9%) sold for $200,000+. (An average of $200,000 corresponds to a Price Index of 2.74.)

So I do not feel too bad about that, especially considering that almost all of the subpopulations I have examined have Price Indexes (X values) much closer to 1.00 than to 2.74 (which corresponds to an average of $200,000) or higher. Almost all of the subpopulations I have examined contain a wide variety of prices, both high and low, and most of those subpopulations cluster around a Price Index of 1.00.

In terms of which price range produced the best results from its prices, I would have to say that the big winner is $50,000-99,999. It had a Price Index of 1.65 and a PPI (result) of 1.92. (And regression predicted only 1.46). Not bad at all.
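Here is a small Python sketch of that comparison. It assumes "best results from its prices" means the largest gap of result minus price, which is an assumption on my part; a ratio of result to price would rank the ranges differently. The names are illustrative.

```python
# (label, Price Index, PPI) for each price range, from the table above.
rows = [
    ("$100-9,999", 0.24, 0.32),
    ("$10,000-19,999", 0.71, 0.66),
    ("$20,000-29,999", 0.93, 0.89),
    ("$30,000-49,999", 1.18, 1.29),
    ("$50,000-99,999", 1.65, 1.92),
    ("$100,000-199,999", 2.24, 2.11),
    ("$200,000-499,999", 3.18, 2.66),
    ("$500,000-999,999", 4.87, 2.86),
    ("$1,000,000+", 6.78, 4.60),
]

# Gap = result minus price (assumed measure); positive means results beat prices.
gaps = {label: round(result - price, 2) for label, price, result in rows}
best = max(gaps, key=gaps.get)

for label, gap in gaps.items():
    print(f"{label:<18} {gap:+.2f}")
print("Largest positive gap:", best)
```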

I should caution you to take that result with a grain of salt, however. That price range contained the three best stakes winners of the entire group of 70,714 foals and 2,415 stakes winners (Zenyatta, Curlin, and English Channel). Without those three this range had a PPI of 1.72, still better than 1.65 but much closer to it.

And as for the $1,000,000+ group, the regression equation just about nailed it. It predicted a result of 4.54. The actual result was 4.60.

Of course it is also possible that if I had the computing power to calculate all this as 70,714 individual pieces of data, regression analysis might have yielded a better and more accurate equation. I am not sure about this point. Any professional mathematicians out there care to enlighten me??????


4 Responses to Without a Coon Again?????

  1. Byron Rogers says:

    David

    If you want to send me an Excel sheet with the data for the 70,714 foals (I think that there may be a 65,000 limit on Excel though) with their price and PPI, I’d be happy to run it through our symbolic regression program and send you back the best model to describe the relationship. It uses cloud computing to generate it, so it shouldn’t take long to find a model with decent correlation.

    Byron.

    • ddink55 says:

      Thank you for the offer. I do not know if it would be worth the effort. I do not have Excel. I do have a spreadsheet function in word processing that is limited to 65,536 units. Do not know if that function has the same capabilities as Excel. Could send you the 2,415 stakes winners with their prices and Performance Points. The remaining 68,299 could be summarized by using the maverage for the whole group (38.95 for the 26,975 foals sold for $100-9,999 who were not stakes winners, for example). All non-stakes winners would have a result (PPI) of zero. Would that work????? Would that be worth the effort????

  2. Byron Rogers says:

    I should add, it might be more interesting to add sex as a variable (1 for colts, 2 for fillies) as it could have an influence on the outcome given that high prices tend to be saturated with colts.

    • ddink55 says:

      Byron,

      Don’t bother responding to my last comment (unless you still want to). I won’t be needing to analyze 70,000+ pairs of data. The equation that satisfies almost all of my requirements is Y’ = 0.20 + 0.8X. Thanks for the input.
