Help comparing column values
Results 1 to 5 of 5

Thread: Help comparing column values
Thanks Thanks: 0 Likes Likes: 0

  1. #1
    Board Regular
    Join Date
    Jun 2002
    Posts
    58
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Help comparing column values

    I am working on a spreadsheet that includes the location of ~80k global cities as defined by an internal system (Gov), Bing and Google. I would like the audit the data to identify entries where the internal system may be incorrect or where there is significant inconsistency between sources. Since the values are rarely identical, I added columns for the straight line distances between points to analyze. I would like to add a column (manually added in the example below) that describes how the sources compare to each other

    - Are they all within 1Km of each other = Alignment
    - Are they all in conflict (greater than 1KM) = Conflict
    - Is Gov inconsistent with Bing & Google = Gov Variance
    - Is Bing inconsistent with Gov and Google = Bing Variance
    - Is Google inconsistent with Gov and Bing = Google Variance

    * Note, not trying to determine which value(s) is/are correct. I am looking for a general idea of how location definition varies.

    Sample data below. As always, thanks for the assist - NASG


    Excel 2016 (Windows) 32 bit
    ABCDEFGHIJKLMN
    1LocationGovBingGoogleGov vs. BingGov vs. GoogleBing vs. GoogleAlignmentAlignement Values
    2QueryLatLongLatLongLatLong(km)(km)(km)
    3New York, NY, US40.75348889-73.9809611140.7820015-73.8317031940.7127837-74.00594138.063.110.3ConflictAligned
    4Corpus Christi, TX, US27.79628889-97.4035527.79640961-97.4035568227.8005828-97.3963810.010.530.53AlignedConflict
    5ATL, GA, US33.640772-84.44786833.74831009-84.3911132833.7489954-84.38798248.128.230.19Gov VarianceGov Variance
    6Macon, GA, US32.83608056-83.6625805632.83967972-83.6275787432.8406946-83.63240222.051.780.29Gov VarianceBing Variance
    7Puerto Vallarta, , MX20.61448889-105.239311120.60713959-105.224372920.653407-105.22533161.092.843.2ConflictGoogle Variance
    8Plano, TX, US33.04641111-96.7448694433.0207901-96.6992492733.0198431-96.69888563.183.240.07Gov Variance
    9Halifax, NS, CA44.6488-63.5753611144.64960098-63.5946998644.6488625-63.57531960.950.010.95Aligned
    10EWR, NJ, US40.69069-74.1774140.69287872-74.1854476940.6895314-74.17446240.450.170.62Aligned
    11Milan, IT45.464219449.18978055645.451999669.19277000445.46542199.18592430.860.210.99Aligned
    12New Orleans, LA, US29.95368889-90.0777611129.95369911-90.0777511629.9510658-90.071532300.420.41Aligned
    13

    Sheet1




  2. #2
    New Member
    Join Date
    Jul 2016
    Location
    Darmstadt
    Posts
    11
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Help comparing column values

    Dear NASG,

    for my approach to get this done, there are some steps needed:


    1.) Geometric cases:

    for a complete picture, i suggest to consider to add 3 more cases. Even if rare, there is the possibility, that on coordinate is more or less in the middle of the two others. I call them "centred".


    2.) Lookup table


    On an other sheet (#2) create a lookup table like this:

    Byte Alignment
    0 Aligned
    1 GOV centred
    2 BING centred
    3 GOOGLE variance
    4 GOOGLE centred
    5 BING vary
    6 GOV variance
    7 Conflict

    And define a name to the table like "LUT" (for look up table)



    3.) Criteria maximum distance



    to make things more flexible, please also find space on sheet #2 to define your maximum allowed distance, in your case you mentioned 1 for 1 KM, and also define the name for it: "MD", for max distance.




    4.) assigning alignment text

    - first clear column M in your table


    - enter in L3: =IF(H3>MD,1,0)

    which tests if the distance is higher or lower your maximum distance, and if it is lower, outputs a "0"


    - auto fill column L, M and N, down to the last row of your table



    - enter in O3: =L3*2^2+M3*2+N3

    which will calulate a number from 0 to 7 (decimal) from these 3 "bits"


    - auto fill column O down to the last row of your table


    - enter in K3: =VLOOKUP(O3,LUT,2,FALSE)

    which will lookup the text from the look up table


    - auto fill column K down to the last row of your table



    5.) Remarks

    I hope I did understand your problem correctly.

    for further statistical analysis, the values in column O are easier to work with.


    Regrads, NvK

  3. #3
    Board Regular
    Join Date
    Jun 2002
    Posts
    58
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Help comparing column values

    Huzzah! That was awesome. Thank you!

  4. #4
    New Member
    Join Date
    Jul 2016
    Location
    Darmstadt
    Posts
    11
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Help comparing column values

    Quote Originally Posted by NotASuperGenius View Post
    Huzzah! That was awesome. Thank you!
    You are very welcome.

    btw: i am Interested in how you calculate the distance between 2 coordinates, since as far as I know one should use great circle approximation and that the distance are getting shorter when you are moving from equator to one of the poles ...

  5. #5
    Board Regular
    Join Date
    Jun 2002
    Posts
    58
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Help comparing column values

    I used a custom function from

    Calculate distance in Google spreadsheet | Winfred van Kuijk

    The biggest challenge was working around the API limits.

Some videos you may like

User Tag List

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •