Difference in calculated correlation coefficient value between Office 2016 and Office 365
https://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2709119#M112395
<P>When I create a scatter plot in Excel (2016) and add a trendline to this data, I get a (correct) R^2 value of 0.9838. I also get the same value using the =correl() function. </P><P>However, if I load the file into Office365 (online), without changing anything myself, the R^2 value on the chart changes to 0.997. This does not change the value given by the correl() function. Is there a difference between the versions of Excel in the way the R^2 value shown with the trendline is calculated and displayed? Is this a bug?</P><P>I also reported this through the app at the end of 2020, but I have had no reply or acknowledgement. </P><TABLE width="144"><TBODY><TR><TD width="64" height="19">Concentration</TD><TD width="80">Absorbance</TD></TR><TR><TD height="19">0.1</TD><TD>0.323</TD></TR><TR><TD height="19">0.2</TD><TD>0.521</TD></TR><TR><TD height="19">0.3</TD><TD>0.981</TD></TR><TR><TD height="19">0.4</TD><TD>1.243</TD></TR><TR><TD height="19">0.5</TD><TD>1.478</TD></TR></TBODY></TABLE>Wed, 01 Sep 2021 20:16:44 GMThttps://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2709119#M112395Dave_Gerrard2021-09-01T20:16:44ZRe: Difference in calculated correlation coefficient value between Office 2016 and Office 365
https://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2709558#M112415
<P><LI-USER uid="1143876"></LI-USER> wrote: ``I get a (correct) R^2 value of 0.9838. I also get the same value using the <FONT color="#FF0000">=correl()</FONT> function. [....] if I load the file into Office365 (online), [...] the R^2 value on the chart changes to 0.997``</P><P> </P><P>(Erratum: I believe you mean RSQ(), not CORREL(). CORREL is "R", the square root of R^2.)</P><P> </P><P>I suspect that you selected the Set Intercept=0 option for the trendline.</P><P> </P><P>And yes, the trendline calculation of "R^2" for zero intercept did change in Office 365, or so I'm told.</P><P> </P><P>But <STRONG><FONT color="#FF0000">that is a correction</FONT></STRONG>, not a bug.</P><P> </P><P>For details, see my last response in the thread at <A href="https://techcommunity.microsoft.com/t5/excel/same-xy-scatter-but-different-r-square/td-p/2456776" target="_blank" rel="noopener">https://techcommunity.microsoft.com/t5/excel/same-xy-scatter-but-different-r-square/td-p/2456776</A> .</P><P> </P><P>(But rereading it myself, that explanation is contorted and difficult to follow. Sigh.)</P><P> </P><P>In a nutshell, we can see the difference by looking at the "R^2" returned by LINEST, which is really the <STRONG><FONT color="#FF0000">"coefficient of determination" (CoD)</FONT></STRONG>.</P><P> </P><P>Usually, RSQ and CoD are the same. But they <STRONG><FONT color="#FF0000">differ under some conditions</FONT></STRONG>. 
One of those conditions is when a zero intercept is specified.</P><P> </P><P>With your posted data:</P><P> </P><P>=INDEX(LINEST(B1:B5,A1:A5,<STRONG><FONT color="#FF0000">TRUE</FONT></STRONG>,TRUE),3,1) returns <STRONG><FONT color="#FF0000">0.98379</FONT></STRONG>9508754327</P><P> </P><P>=INDEX(LINEST(B1:B5,A1:A5,<STRONG><FONT color="#FF0000">FALSE</FONT></STRONG>,TRUE),3,1) returns <STRONG><FONT color="#FF0000">0.99701</FONT></STRONG>2717209636</P><P> </P><P>which round to 0.9838 and 0.9970 respectively.</P><P> </P><P>Excerpts from my previous explanation:</P><P> </P><P>In general, the CoD is calculated by the formula 1 - SSres/SStot, or equivalently SSreg/SStot. SStot is Sigma((Y - avgY)^2), where "Y" is the original data.</P><P> </P><P>[In contrast, see the RSQ help page to see how the true R^2 is calculated.]</P><P> </P><P>That formula is <STRONG><FONT color="#FF0000">always used to calculate the linear trendline</FONT> <FONT color="#FF0000">"R^2"</FONT></STRONG>, at least in Excel 2010 (which I use) and in 2013, 2016 and 2019 [...].</P><P> </P><P>That formula is <STRONG><FONT color="#FF0000">also used to calculate LINEST "R^2"</FONT></STRONG> when "zero intercept" is not specified (const = TRUE).</P><P> </P><P>However, <STRONG><FONT color="#FF0000">when "zero intercept" is specified</FONT></STRONG> (const = FALSE), <STRONG><FONT color="#FF0000">LINEST calculates SStot differently</FONT></STRONG>, namely: Sigma(Y^2).</P><P>[....]</P><P>I am told that in Office 365 Excel, <STRONG><FONT color="#FF0000">the calculation of the linear trendline "R^2" has been changed to agree with LINEST "R^2"</FONT></STRONG> when "zero intercept" is specified.</P><P> </P><P>----</P><P> </P><P>That difference in the calculation of CoD for "zero intercept" is intentional. See the LINEST help page. 
It is based on an esoteric academic rationale, which not all statisticians agree with.</P><P> </P>Thu, 02 Sep 2021 02:06:38 GMThttps://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2709558#M112415Joe User2021-09-02T02:06:38ZRe: Difference in calculated correlation coefficient value between Office 2016 and Office 365
https://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2710841#M112459
<P><LI-USER uid="146717"></LI-USER> Thank you very much. That does explain the values I am getting. </P><P> </P><P>I think it is poor of MS to change the underlying default formula of the trendline "R^2" displayed on the charts without making it clear or renaming it in some way. </P><P>The way I am doing it is to take a single file in Office 2016 and re-open it in Office365. The intercept is NOT set at zero in the first, but Office365 then re-calculates the value and displays the altered value in the same manner. Is there a list somewhere of similar alterations that are made between versions? Where I work, we have several versions running across a large number of people; this kind of thing makes it very difficult to tell whether someone has made an error or whether Excel is altering the results.</P><P>I will now add this to my list of reasons to be wary of Excel. </P><P> </P><P>[Incidentally: to explain the erratum (for anyone following closely) - I was using CORREL()^2 rather than RSQ() (they are equivalent) and forgot to type the "^2" in my question. Thanks <LI-USER uid="146717"></LI-USER> for spotting that too.]</P><P> </P>Thu, 02 Sep 2021 09:22:23 GMThttps://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2710841#M112459Dave_Gerrard2021-09-02T09:22:23ZRe: Difference in calculated correlation coefficient value between Office 2016 and Office 365
https://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2713298#M112563
<P><LI-USER uid="1143876"></LI-USER> wrote: ``The intercept is NOT set at zero in the first but Office365 then re-calculates the value and displays the altered value in the same manner``</P><P> </P><P>If you are saying that Office 365 Excel automagically __changed__ your Set Intercept=0 option from not selected to selected, or that it calculates R^2 __as_if__ that option is selected when it __is_not__, <STRONG><FONT color="#FF0000">that is certainly a defect</FONT></STRONG>.</P><P> </P><P>IMHO, you should report it. I believe the most effective way to do that is to use the Feedback feature. Do not bother posting to excel.uservoice.com. That is a waste of time, IMHO.</P><P> </P><P>(PS.... Oh good: excel.uservoice.com is now officially defunct. But do not follow MSFT's suggestion to use techcommunity.microsoft.com -- a total waste of time -- or MSFT Store.)</P><P> </P><P>I do not have the Feedback feature in my version of Excel. But I have read that the best way to ensure that someone at MSFT actually pays attention to it is:</P><P> </P><P>1. After clicking Feedback, select "I Don't Like Something". (Not "I have a suggestion".)</P><P> </P><P>2. Select all three checkboxes: Attach My Logs to Help Troubleshoot, Include Screenshot, and You Can Contact Me About This Feedback.</P><P> </P><P>3. Provide a valid e-mail address that you actually check.</P><P> </P>Thu, 02 Sep 2021 20:05:24 GMThttps://techcommunity.microsoft.com/t5/excel/difference-in-calculated-correlation-coefficient-value-between/m-p/2713298#M112563Joe User2021-09-02T20:05:24Z
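<P>For anyone who wants to check the two SStot definitions discussed earlier in this thread outside Excel, here is a minimal Python sketch using the posted concentration/absorbance data. It is only an illustration of the formulas 1 - SSres/SStot with SStot = Sigma((Y - avgY)^2) versus SStot = Sigma(Y^2); the helper names are made up for this example, and it does not claim to reproduce Excel's internal implementation.</P>

```python
# Data posted in the thread: concentration (x) and absorbance (y).
x = [0.1, 0.2, 0.3, 0.4, 0.5]
y = [0.323, 0.521, 0.981, 1.243, 1.478]


def mean(values):
    return sum(values) / len(values)


def fit_with_intercept(xs, ys):
    """Ordinary least squares y = slope*x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_through_origin(xs, ys):
    """Least squares y = slope*x with the intercept forced to zero."""
    slope = sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)
    return slope, 0.0


def ss_res(xs, ys, slope, intercept):
    """Residual sum of squares for the fitted line."""
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(xs, ys))


def ss_tot_centered(ys):
    """SStot = Sigma((Y - avgY)^2)."""
    my = mean(ys)
    return sum((b - my) ** 2 for b in ys)


def ss_tot_uncentered(ys):
    """SStot = Sigma(Y^2), the definition described for zero intercept."""
    return sum(b * b for b in ys)


# Normal fit (const = TRUE): centered SStot.
slope, intercept = fit_with_intercept(x, y)
r2_with_intercept = 1 - ss_res(x, y, slope, intercept) / ss_tot_centered(y)

# Zero-intercept fit (const = FALSE): same residuals, two choices of SStot.
slope0, _ = fit_through_origin(x, y)
res0 = ss_res(x, y, slope0, 0.0)
r2_zero_uncentered = 1 - res0 / ss_tot_uncentered(y)
r2_zero_centered = 1 - res0 / ss_tot_centered(y)

print("with intercept, centered SStot:   ", r2_with_intercept)
print("zero intercept, uncentered SStot: ", r2_zero_uncentered)
print("zero intercept, centered SStot:   ", r2_zero_centered)
```

<P>The first two printed values should match the two LINEST results quoted above (about 0.9838 and 0.9970). The third shows what the zero-intercept "R^2" looks like if SStot is left centered, which is the older trendline behavior as described in this thread.</P>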