Sentiment analysis - assigning positive, negative or neutral score

misstashnz

New Member
Joined
Jun 14, 2021
Messages
5
Office Version
  1. 365
Platform
  1. Windows
I'm trying to complete a sentiment analysis on product reviews. I have used a 'bag of words' method, and what I'm trying to achieve is to give each review a rating of '1' for positive, '0' for neutral, or '-1' for negative. There's ~35k reviews, so need a formula that I can just replicate down the list.

I've managed to get as far as giving a review a tick as positive if i contains one of my positive words, same as for negative, and then created an IF OR formula to specify it is positive if it only has positive words, negative if only negative words, and neutral if it contains both. So each review has individual columns for positive, neg, and neutral - only one column will display 'Yes'. From here, I'm really stuck in figuring out how to create a formula with three variable outputs - everything is just 'value if true' or 'value if false'.

Any bright ideas would be immensely appreciated!

Assessment 1 - Qualitative review.xlsx
IJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJAKALAMANAOAPAQARASATAUAVAWAXAYAZBABBBCBDBEBFBGBHBIBJBKBLBM
1Positive wordsNegative words
2FastEasyGreatGoodLoveRecommendEnjoyAmaz..HappyExcellentHandyGladConvenientImpress..Beaut..QuickPerfectNiceUsefulAwesomeSturdyEasilyWonderfulFantasticInexpensivePositive?Positive ScoreSlowHardHateDislikeFreezeIssueDon't likeDifficultInconvenientAnnoyingHeavyUnnecessaryFlawPriceyProblemAwkwardHurtUglyBulkyBadDo not likeComplicateUnablePoorGlitchNegative?Negative ScoreOverall PositiveOverall NegativeOverall Neutral
30010000000000000000000000Yes10000000000000000000000000No0YesNoNo
40111000000000000000000000Yes30000000000000000000000000No0YesNoNo
50000000100000000000000000Yes10000000000000010000000000Yes1NoNoYes
60100000100000000110000000Yes40000000000000010000000000Yes1NoNoYes
70000000000100000000000000Yes10000000000000000000000000No0YesNoNo
81111001100010000000001000Yes80000000000000000000000000No0YesNoNo
90000000000000000000000000No00100000000000000000000000Yes1NoYesNo
100010100000000000000000000Yes20000000000000000000000000No0YesNoNo
111001000100000000000000000Yes30000000000000000010100000Yes2NoNoYes
120000000100000000010000000Yes20000000000100000000000000Yes1NoNoYes
130010000000000000010001000Yes30000000000000000000000000No0YesNoNo
140000000100000000010000000Yes20000000000000000000000000No0YesNoNo
150001000000000000010010000Yes30000010000000000000000000Yes1NoNoYes
exercise 1
Cell Formulas
RangeFormula
I3:I15I3=COUNTIF(H3,"*fast*")
J3:J15J3=COUNTIF(H3,"*easy*")
K3:K15K3=COUNTIF(H3,"*great*")
L3:L15L3=COUNTIF(H3,"*good*")
M3:M15M3=COUNTIF(H3,"*love*")
N3:N15N3=COUNTIF(H3,"*highly recommend*")
O3:O15O3=COUNTIF(H3,"*enjoy*")
P3:P15P3=COUNTIF(H3,"*amaz*")
Q3:Q15Q3=COUNTIF(H3,"*happy*")
R3:R15R3=COUNTIF(H3,"*excellent*")
S3:S15S3=COUNTIF(H3,"*handy*")
T3:T15T3=COUNTIF(H3,"*glad*")
U3:U15U3=COUNTIF(H3,"*convenient*")
V3:V15V3=COUNTIF(H3,"*impress*")
W3:W15W3=COUNTIF(H3,"*beaut*")
X3:X15X3=COUNTIF(H3,"*quick*")
Y3:Y15Y3=COUNTIF(H3,"*perfect*")
Z3:Z15Z3=COUNTIF(H3,"*nice*")
AA3:AA15AA3=COUNTIF(H3,"*useful*")
AB3:AB15AB3=COUNTIF(H3,"*awesome*")
AC3:AC15AC3=COUNTIF(H3,"*sturdy*")
AD3:AD15AD3=COUNTIF(H3,"*easily*")
AE3:AE15AE3=COUNTIF(H3,"*wonderful*")
AF3:AF15AF3=COUNTIF(H3,"*fantastic*")
AG3:AG15AG3=COUNTIF(H3,"*inexpensive*")
AH3:AH15AH3=IF(OR(I3=1,J3=1,K3=1,L3=1,M3=1,N3=1,O3=1,P3=1,Q3=1,R3=1, S3=1, T3=1, U3=1, V3=1, W3=1, X3=1, Y3=1, Z3=1, AA3=1, AB3=1, AC3=1, AD3=1, AE3=1, AF3=1, AG3=1),"Yes","No")
AI3:AI15,BJ3:BJ15AI3=SUM(I3:AG3)
AJ3:AJ15AJ3=COUNTIF(H3,"*slow*")
AK3:AK15AK3=COUNTIF(H3,"*hard*")
AL3:AL15AL3=COUNTIF(H3,"*hate*")
AM3:AM15AM3=COUNTIF(H3,"*dislike*")
AN3:AN15AN3=COUNTIF(H3,"*freez*")
AO3:AO15AO3=COUNTIF(H3,"*issue*")
AP3:AP15AP3=COUNTIF(H3,"*don't like*")
AQ3:AQ15AQ3=COUNTIF(H3,"*difficult*")
AR3:AR15AR3=COUNTIF(H3,"*inconvenient*")
AS3:AS15AS3=COUNTIF(H3,"*annoying*")
AT3:AT15AT3=COUNTIF(H3,"*expensive*")
AU3:AU15AU3=COUNTIF(H3,"*unnecessary*")
AV3:AV15AV3=COUNTIF(H3,"*flaw*")
AW3:AW15AW3=COUNTIF(H3,"*pricey*")
AX3:AX15AX3=COUNTIF(H3,"*problem*")
AY3:AY15AY3=COUNTIF(H3,"*awkward*")
AZ3:AZ15AZ3=COUNTIF(H3,"*hurt*")
BA3:BA15BA3=COUNTIF(H3,"*ugly*")
BB3:BB15BB3=COUNTIF(H3,"*bulky*")
BC3:BC15BC3=COUNTIF(H3,"*bad*")
BD3:BD15BD3=COUNTIF(H3,"*do not like*")
BE3:BE15BE3=COUNTIF(H3,"*complicate*")
BF3:BF15BF3=COUNTIF(H3,"*unable*")
BG3:BG15BG3=COUNTIF(H3,"*poor*")
BH3:BH15BH3=COUNTIF(H3,"*glitch*")
BI3:BI15BI3=IF(OR(AJ3=1,AK3=1,AL3=1,AM3=1,AN3=1,AO3=1,AP3=1,AQ3=1,AR3=1,AS3=1,AT3=1,AU3=1,AV3=1,AW3=1,AX3=1,AY3=1,AZ3=1,BA3=1,BB3=1,BC3=1,BD3=1,BE3=1,BF3=1,BG3=1,BH3=1),"Yes","No")
BK3:BK15BK3=IF(AND(AH3="Yes",BI3="No"),"Yes","No")
BL3:BL15BL3=IF(AND(AH3="No",BI3="Yes"),"Yes","No")
BM3:BM15BM3=IF(AND(AH3="Yes",BI3="Yes"),"Yes","No")
 

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
Is column H freeform text? If so, your current formulas could be giving misleading results. For example "My glove has an ugly fastener" would count two positive scores and one negative, despite it not containing any of your positive "words" as words. Is that a problem for you? That is, do we need to check for whole words rather than just text strings? If looking for whole words though I can see a problem with your examples like "Amaz" where I presume you are trying to cover off several possibilities.

If a text contained 6 positive words and one negative word can you confirm that you still want that result as 'Neutral'?
 
Upvote 0
Is column H freeform text? If so, your current formulas could be giving misleading results. For example "My glove has an ugly fastener" would count two positive scores and one negative, despite it not containing any of your positive "words" as words. Is that a problem for you? That is, do we need to check for whole words rather than just text strings? If looking for whole words though I can see a problem with your examples like "Amaz" where I presume you are trying to cover off several possibilities.

If a text contained 6 positive words and one negative word can you confirm that you still want that result as 'Neutral
Yeah, this is for one of my papers at uni and we're currently looking at ways to code qualitative data. We're aware that this method will not provide accurate results, but just learning how it works and understanding pros and cons (such as what you've mentioned).

Unfortunately the internet has not been kind in showing me ways to do this that don't use existing software or excel add-ins, so doing the best I can!
 
Upvote 0
What about something like this then?

misstashnz.xlsm
HIJKLMN
1PositiveNegative
2TextResultFastSlow
3It was fast and easyPositiveEasyHard
4Tom hates eggsNegativeGreatHate
5This is amazingly difficult and not fastNeutralGoodDislike
6Nothing here at allNeutralLoveFreeze
7Slow, hard, hate, dislikeNegativeRecommendIssue
8EnjoyDon't like
9AmazDifficult
10HappyInconvenient
11ExcellentAnnoying
12HandyHeavy
13GladUnnecessary
14ConvenientFlaw
15ImpressPricey
16BeautProblem
17QuickAwkward
18PerfectHurt
19NiceUgly
20UsefulBulky
21AwesomeBad
22SturdyDo not like
23EasilyComplicate
24WonderfulUnable
25FantasticPoor
26InexpensiveGlitch
Sheet2
Cell Formulas
RangeFormula
I3:I7I3=INDEX({"Neutral","Positive","Negative"},MOD((COUNT(SEARCH(M$2:M$26,H3))>0)+2*(COUNT(SEARCH(N$2:N$26,H3))>0),3)+1)
 
Upvote 0

Forum statistics

Threads
1,215,045
Messages
6,122,840
Members
449,096
Latest member
Erald

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top