Log-Linear Regression Model Decomposition

Nick · Feb 1, 2008

Hi,

I have estimated a log linear regression model in Excel with the following functional form:

lnY = a + XB1 + Xb2 + Xb3 + Xb4

The dependent variable is in log form, the independent/explanatory variables in linear form. With the equation I can estimate/forecast the linear value of Y by taking the antilog/exponent of the forecast from the equation so that I can see the value in the orginal Y values instead of the logs. This is fine. But I also want to decompose the forecast/estimate by the respective explanatory X variable.

For example if the total forecast in log form = 5, then the anti log/exponent of that gives me a forecast of 148 in the original Y series. Now of that 148, what I need to calculate is how much is X1, X2 etc is worth, e.g

a = 2
X1 = 45
X2 = 15
X3 = 25
x4 = 61

Total = 148

Does anyone know how to do this?

Thanks,

Nick

Antonio Toledo · Feb 1, 2008

To come up with a forecast of 5 in lognormal form you need to input the independent variables... these independents have not been transformed.
How does your regression output look like?

A simple example,

if sales per year can be predicted by marketing costs with this equation:
LN(Sales) = a + B1 * X1
= 1 + 0,5*X1

And if our marketing cost one year is 3 then our log sales is 2,5
Meaning that with marketing costs of 3 our expected sales is ~12

Nick · Feb 2, 2008

Hello Antonio,

Yes I realise I would need to input the indepentent variables to get the actual forecast whether in log

or orginal Y. This I know how to do as per your example which I agree with. But continuing with your example, here's what I want to do.

Lets say sales is also a function of other variables,

LN(Sales) = a + B1 * X1 + B2 * X2 + B3 * X3

Where X1 = Marketing, X2 = weather, X3 = economy and each B represents the estimated beta coefficient associated with each respective variable: b1=0.5, b2=0.2 and b3=1.3 and a = 1

Based on the above:

Forecast of sales in Logs = 7.9
Forecast of actual sales (Y in the orginal series) = 2595

Now of the 2595 forecast, I want to want to breakdown it's composition in terms of the independent variables, i.e how much does each of the three independent variables contribute to making up the sales forecast. For example marketing might contribute 800 dollars of sales, weather 300 dollars, the economy 1000 dollars and the alpha (a) 495 dollars.

Hope this makes sense

Antonio Toledo · Feb 3, 2008

I don't think that is possible...

The relationship between sales and its independent variable is like this when Y is antiloged.

Y = e^a*e^(b1*x1)*e^(b2*x2)*e^(b3*x3)

Since the total is a product of it's components you can't make claims like how much of sales comes from marketing. Since the product of two variables is larger than their sum one could make the claim of synergies.

Although you could make the argument, all other variables constant, how much would Y increase if we increased marketing?
Or how much would Y be if we had no marketing?

Legacy 14611 · Feb 3, 2008

Couple of comments:

Your model won't give you linear unbiased forecasts of Y (unbiased in the sense that the mean or expected value of the difference between the forecast and the eventuating value of Y is zero). It may give you linear unbiased forecasts of Ln

but it seems that Ln(Y) isn't really your main aim. Are you sure your regression model is optimal for the purpose for which you want to use it?

If your model were linear in Y it would clearly be easy enough to determine the amount by which Y changes if (say) X1 changes by a specified amount and the other explanatory variables don't change - the answer to this is just the coefficient B1 and it doesn't depend on the values of any of the variables in the situation that you wish to forecast. I assume this is what you mean by decomposing the forecast into the contributions of X1, X2 etc.
In your model - nonlinear in Y - the contribution of any of the explanatory variables does depend upon the value of Y. You can easily find this out by differentiating if your variables are differentiable, if not by using a finite differences approach.
Specifically, in your model as it stands, the rate of change in Ln

when (say) X1 changes (other explanatory variables remaining unchanged) is constant and equal to B1. However the rate of change of Y (if only X1 changes) is equal to B1*Y and thus does depend upon the value of Y, or equivalently on the values of all of X1, X2, X3 and X4.
You can't find out any more than this from your model as it stands.
Again, I have to ask if you are sure your model is optimal for the purposes for which you want to use it.

FellowExcellor · Feb 4, 2008

Hi,

Yep, I know how to do the decomposition in the linear model (where all variables are in linear form including Y) but I suspect the log formation is a better more realistic real life representation than the pure linear one which assumes that the Y variable responds in a proportional way to changes in the X variables not matter what value the Y takes on. In reality it's likely to be non-linear, at different values of y (e.g very high values) Y may repond differently to changes in the Xs than at say at very low values - in the case of the sales example for instance because of diminishing returns etc.

I think my specification is indeed correct as I'm looking for a multiplicative model rather than an additive one as the linear specification would imply.

Rugila, you mention differentiating my variables or using a finite differences approach. Would this give me the decomposition I am looking for? I was assuming you said it couldn't be done!
I know how to do regression analysis, I'm just not clued up on the mathematics aspects.

Is there perhaps any other transformations I could apply to the raw data to estimate/replicate a multiplicative/non-linear model instead of the log model that would more easily allow me to get the decomposition I need? What about standardizing each variable by divided each observation by the mean of the series and using this in the regression? Would that perhaps give me what I want.

Thanks,

Nick

Nick · Feb 4, 2008

Hi,

Yep, I know how to do the decomposition in the linear model (where all variables are in linear form including Y) but I suspect the log formation is a better more realistic real life representation than the pure linear one which assumes that the Y variable responds in a proportional way to changes in the X variables not matter what value the Y takes on. In reality it's likely to be non-linear, at different values of y (e.g very high values) Y may repond differently to changes in the Xs than at say at very low values - in the case of the sales example for instance because of diminishing returns etc.

I think my specification is indeed correct as I'm looking for a multiplicative model rather than an additive one as the linear specification would imply.

Rugila, you mention differentiating my variables or using a finite differences approach. Would this give me the decomposition I am looking for? I was assuming you said it couldn't be done!
I know how to do regression analysis, I'm just not clued up on the mathematics aspects.

Is there perhaps any other transformations I could apply to the raw data to estimate/replicate a multiplicative/non-linear model instead of the log model that would more easily allow me to get the decomposition I need? What about standardizing each variable by divided each observation by the mean of the series and using this in the regression? Would that perhaps give me what I want.

Thanks,

Nick

Legacy 14611 · Feb 4, 2008

Hi Nick (=FellowExcellor??)

OK. A few more comments.

1. You are not very specific just what the "decomposition you are looking for" is. I made a suggestion - a common one with this sort of stuff, that it can be interpreted as the rate of change of the variable to be explained when one of the explanatory variables changes, the others being supposed hypothetically constant for the purposes of the exercise. This is usually done by partial differentiation. In my earlier post I gave the result for this, but if you mean something different by "decompose" you will need to explain it more clearly.

2. Another possibility (to interpret “decompose”, which may or may not coincide with the first, is to use a mathematical result called Taylor's Theorem. You are almost certainly using this implicitly when writing down your regression model in the first place, but since you say you are "not clued up on mathematical aspects" looking into it further is unlikely to be useful to you in the present context. If you wish you could implement it by writing your model as something like Y=a +b*X1 +c*X1^2 + etc. This would give you some of the curvature properties you seem to want.

3. A "multiplicative/non linear model" is more usually written in the form as Ln

=a + Ln(X1) + etc. This is typically easier to do regressions with, do subsequent mathematical, predictive etc. or other work with, has curvature properties which are often useful (almost certainly would be in your context), easier to interpret, etc. The way you write your model is not necessarily incorrect, just inconvenient for further use, and unless you have compelling reasons for supposing it is the most valid and reasonable form I think (especially with a limited mathematical background) you'd be better off using a more standard model formulation.

4. "diminishing returns etc." refers to a specific type of curvature property. However many nonlinear regression formulations can express this curvature property, and again I think you'd be better (again, unless you have compelling reasons otherwise) to stick to more standard formulations.

5. Using finite differences is also unlikely to be useful to you. George Boole (of Boolean repute say when you Dim excel variables as “boolean”) wrote a book on this back in the 1800’s and there’s been a fair bit of stuff done since. Basically, data for regression variables are usually collected at discrete time intervals (daily, weekly or whatever). The differentiation usually used to analyse “diminishing returns” or other curvature properties is logically not applicable to functions only defined at discrete intervals without putting considerable faith in “it’s a good enough approximation for my purposes” or similar, which at best constitute rather a leap of faith. Some good work has been done in recent years on getting results for data collected in continuous time (or very close to it) but most of this is pretty esoteric stuff .

Nick · Feb 6, 2008

Hi,

In terms of the decomposition I am looking for. Well I think I understand exactly how to interpret the beta regression coeffeicents in a regression model in both linear and non-linear (log) frameworks. So in the case of: Ln (Y) = a +X1 , the x1 coefficent measures the % change in Y associated in unit change in X, all other variables held constant. Where Both X and Y variables are in logs gives you the elasticty of Y to X, what a one % change in X leads to in terms of a % change in Y.

Someone please correct me if I'm wrong! Now in terms of the final output (decomposition) I need from the model, it is what I highlighted in my previous posts with the sales example with three variables/drivers - how much in Y terms does each x variable contribute for a given observation of Y. If still unclear, I can email you an example of what I mean.

Now please correct me if I'm wrong but it seems to me that semilog (log of Y only) or full log/multiplicative model (log of Y and Xs) will not allow me to do this, not in terms of the orginal Y values anyway. Only the linear model does. It seems I need to look for an alternative non-log transformation to apply to the Y or X variables that would allow me to capture the non-linear aspects AND allow me to carry out the decomposition in the orginal Y variable form.

Thanks,

Nick

Legacy 14611 · Feb 6, 2008

Hi Nick,

It seems to me you are seeking properties of linearity to be valid for nonlinearity and (as far as I can tell) also wanting the reverse. I don’t know of any functional form that has these properties.

Apart from the earlier suggestions made earlier, possibly stepwise linearity might more or less suit you. That is, instead of a universally straight line (or flat plane or whatever other dimension of hyperspace you are interested in) you could use a set of joined (i.e. continuous) straight lines with different slopes. This would give a response of Y to changes in X1 etc (decomposition??) that are independent of changes in the other variables, but would depend on which range of X1 you happen to be using at the time.

The only other interpretation I can envisage is that maybe you are looking for functional types which have been called “additively separable” with the untransformed value for Y on the LHS. Something like Y=a + b* f(X1)+ c*g(X2) + etc., where the f and g are transformations chosen to give the curvature properties you apparently want. With these the response of Y to X1 depends on X1 only (although it is independent of X1 only in the linear case). These sorts of function were analysed by Wassily Leontief (look him up on Google) in a 1947 paper entitled ”Introduction to a Theory of the Internal Structure of Functional Relationships” where he considered situations which may be similar to what you want. Likely other similar stuff has been done since.

That all I can say on this. I’m off sailing for a week or so. Hope it has been of some help. I wish you well in your project.

Log-Linear Regression Model Decomposition

Nick

Board Regular

Excel Facts

Antonio Toledo

Board Regular

Nick

Board Regular

Antonio Toledo

Board Regular

Legacy 14611

Guest

FellowExcellor

Board Regular

Nick

Board Regular

Legacy 14611

Guest

Nick

Board Regular

Legacy 14611

Guest

Similar threads

Forum statistics

Share this page

Log-Linear Regression Model Decomposition

Nick

Board Regular

Excel Facts

Antonio Toledo

Board Regular

Nick

Board Regular

Antonio Toledo

Board Regular

Legacy 14611

Guest

FellowExcellor

Board Regular

Nick

Board Regular

Legacy 14611

Guest

Nick

Board Regular

Legacy 14611

Guest

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock