Polynomial Trendlines

TOC1963 · Sep 29, 2006

Ok here's what i'm trying to do. I have a set of x and y coordinates and i'm generating a polynomial equation from them in excel using an xy scatter plot. So I end up with my equation y=ax^4+bx^3..... and that seems to be working well. But I also need to solve for x and I thought I could do the same thing but reverse my coorinates. When I try to fit a curve to these points i'm getting a dip on my curve that gets larger as I increase the order of the equation. So if I plug the x values that I used to generate the curve I don't get anywhere near the y values that I used. Can anyone shed some light as to whey this is works great with my original coordiantes but not when I reverse them?
Thanks,
John

Jon Peltier · Sep 29, 2006

What's the function look like, for instance, what are the poly coordinates that you got? Format the trendline equation as scientific with 15 digits, so we know it's not just rounding error.

Some curves cannot be fitted both ways. If you don't have a unique x for each y, you will not be able to get a valid fit.

Also keep in mind that y=f(x) assumes the error is in y, whereas x=g

assumes it's all in x.

TOC1963 · Sep 29, 2006

x y
0.09 11.212
0.95 13.154
2.07 13.883
3.25 14.492
3.59 14.528

from these points I get the equation
y=-0.091x^4+0.8168x^3-2.6759x^2+4.3136x+10.845

if I reverse them
x y
11.212 0.09
13.154 0.95
13.883 2.07
14.492 3.25
14.528 3.59

I get
y = 2.4987x^4 - 131.82x^3 + 2600.5x^2 - 22732x + 74266

Oaktree · Sep 29, 2006

Consider adding a 6th point to your second chart for x = 10.8448612341, y = 0. The trendline should adjust to something more in line with your expectation.

Jon Peltier · Sep 29, 2006

You're overfitting the crap out of those numbers. Do you have some physical basis for picking a 4th order polynomial, or do you just want something to fit through all of the points?

A second order fit comes pretty close to the switched YX values, so does an exponential fit. For your original XY values, a logarithmic fit isn't bad, nor is 3rd order.

For simplicity, you should use the fit that looks pretty good with the fewest parameters, unless you have some special hypothesis about why X and Y are tied together by a high order polynomial.

Jon Peltier · Sep 29, 2006

Consider adding a 6th point to your second chart for x = 10.8448612341, y = 0. The trendline should adjust to something more in line with your expectation.

If he's trying to make sense of real data, this is akin to cheating, because it adds data which isn't there. Plus it might encourage him to use an even higher order fit. Then fudge the intercept a little to make that one even better.

TOC1963 · Sep 30, 2006

It sounds like i'm going about this the wrong way...let me explain the application in more detail and maybe someone will have a better solution. I'm working a on a job where we're blending skim milk and cream together to obtain a desired butterfat. The problem is that there isn't a sensor that measures butterfat on the fly so we need to use a sensor that measures solids which is indicative of butterfat but is not a linear relationship. So the x and y coordinates I gave were the butterfat of known milk samples and the corresponding analog signal coming from the sensor. So based on those points I wanted to fit a curve and get the equation of that curve and then I could solve for the analog signal I need from the sensor for any butterfat milk I want to make. So I really need to the curve to pass directly through my points to truly be accurate in this process. The second part of this problem is I need to display the current butterfat being produced, or solve the equation for x and plug in the current analog signal to solve for it. So I thought I could just reverse the coordiantes and go through the same process to obtain this equation and that's where I started to run into this problem. So basically I need to develope an accurate curve from my known data and be able to obtain an x coordinate from a known y and also obtain a y coordinate from a known x. Any suggestions?
I appreciate the help.

Jon Peltier · Oct 2, 2006

The whole idea behind regression is that there is error in measurements, and regression gives you the best fit through the data, minimizing the total error between all of the points and the fitted relationship. To get the best fit, you want a reasonable number of points spread evenly through the range of interest.

Your intention is a fit that goes exactly through each point, and this assumes that there's no variability in measurement and in other factors, which is highly unlikely. If I get some milk product with 3.25% butterfat and I make ten measurements, are all 10 going to give a reading of 14.528 on your sensor? What if the temperature varies slightly, or I measure before stirring, then after stirring, then after shaking the sample like crazy? What is the effect of different distributions of fat globule size? And how closely are you tring to measure butterfat? 3.25 ± 0.10? 3.25 ± 0.05? 3.25 ± 0.001?

Have you decided which is your independent variable (X) and which is your dependent variable (Y)? Regression assumes the values of the independent variable do not contribute to the error in the fit, so that the error is a combination of measurement error and the effect of other factors (noise factors, like temperature and so forth).

Also, whichever way you look at your points, you have two tightly spaced points with three other widely spaced points, and those two points have an inordinate influence on the fitted relationship, evidenced by the second relationship which fit very poorly.

What I would suggest is that you make a number of standard mixtures, more than five, evenly spread across the range of interest. Maybe 0 to 4% in increments of 0.5%. Have your samples at the same temperature as what your regular samples will be at, give them the same amount of agitation, make everything as near production measurement conditions as possible. Take your measurements. Repeat on another day. Look carefully at the results. Think hard, because it is hard to justify physically the use of a relationship beyond a first order fit. Don't use a statistical method to justify uning a nice curve that connects all of the points.

If what you want is just a smooth calibration curve, then draw it by hand. If you want it in Excel, then manually insert many points along your range where you think it looks nice, put the XY pairs into a lookup table, and use an interpolation formula (Google: Excel Interpolation) to get Y as a function of X.

Polynomial Trendlines

TOC1963

New Member

Excel Facts

Jon Peltier

MrExcel MVP

TOC1963

New Member

Oaktree

MrExcel MVP

Jon Peltier

MrExcel MVP

Jon Peltier

MrExcel MVP

TOC1963

New Member

Jon Peltier

MrExcel MVP

Similar threads

Forum statistics

Share this page

Polynomial Trendlines

New Member

Excel Facts

MrExcel MVP

New Member

MrExcel MVP

MrExcel MVP

MrExcel MVP

New Member

MrExcel MVP

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock