Page 1 of 2 12 LastLast
Results 1 to 10 of 11

Thread: Golf majors; retreive data from Wikipedia articles
Thanks Thanks: 0 Likes Likes: 0

  1. #1
    Board Regular DRSteele's Avatar
    Join Date
    Mar 2015
    Location
    Calgary
    Posts
    1,969
    Post Thanks / Like
    Mentioned
    5 Post(s)
    Tagged
    1 Thread(s)

    Default Golf majors; retreive data from Wikipedia articles

    Can someone please help me?

    I am trying to use PowerQuery to create a table in Excel of all the golf majors. I wish to extract their dates from the Wikipedia articles for each tournament.

    For example, from this web address https://en.wikipedia.org/wiki/1895_U.S._Open_(golf) I want the contents of the Dates field in the box on the upper right side, October 4 1895 in this case. Then 1896, then 1897 all the way to 2019. I reckon it's just some kind of parameter process but I can't seem to get it right.
    Windows10, Excel 365 Insider
    If you don't declare what version of Excel, we will assume
    Windows10, Excel 365.
    Formulas here are always in
    Green background
    You can get the HTML Maker: https://www.mrexcel.com/forum/excel-...ins-links.html.

  2. #2
    Board Regular sandy666's Avatar
    Join Date
    Oct 2015
    Posts
    2,335
    Post Thanks / Like
    Mentioned
    15 Post(s)
    Tagged
    1 Thread(s)

    Default Re: Golf majors; retreive data from Wikipedia articles

    I know you know but I forgot my Crystal Ball and don't know what you know



    In the first post, show the type of machine (PC / Mac) and the Office version you are working on
    impossible things we do on the spot. for miracles you need to wait for a while

  3. #3
    Board Regular DRSteele's Avatar
    Join Date
    Mar 2015
    Location
    Calgary
    Posts
    1,969
    Post Thanks / Like
    Mentioned
    5 Post(s)
    Tagged
    1 Thread(s)

    Default Re: Golf majors; retreive data from Wikipedia articles

    Unless I'm missing something, that page does not contain the dates. It is important that I have a method of retreiving various pieces of data from scores of web pages, not just the dates.
    Windows10, Excel 365 Insider
    If you don't declare what version of Excel, we will assume
    Windows10, Excel 365.
    Formulas here are always in
    Green background
    You can get the HTML Maker: https://www.mrexcel.com/forum/excel-...ins-links.html.

  4. #4
    Board Regular sandy666's Avatar
    Join Date
    Oct 2015
    Posts
    2,335
    Post Thanks / Like
    Mentioned
    15 Post(s)
    Tagged
    1 Thread(s)

    Cool Re: Golf majors; retreive data from Wikipedia articles

    maybe something like this:

    Code:
    // Select
    let
        Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
        #"Invoked Custom Function" = Table.AddColumn(Source, "fnUSOpen", each fnUSOpen([Select])),
        #"Expanded fnUSOpen" = Table.ExpandTableColumn(#"Invoked Custom Function", "fnUSOpen", {"Column1", "Column2"}, {"Column1", "Column2"})
    in
        #"Expanded fnUSOpen"
    
    // fnUSOpen
    let
        YEAR=(year)=>
    let
        Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/"&(Text.From(year))&"_U.S._Open_(golf)")),
        Data0 = Source{0}[Data]
    in
        Data0
    in
        YEAR
    Select Select Column1 Column2
    2018 2018 Tournament information Tournament information
    2018 Dates June 14–17, 2018
    2018 Location Shinnecock Hills, New York
    40°53'38”N 72°26'24”W? / ?40.894°N 72.440°WCoordinates: 40°53'38”N 72°26'24”W? / ?40.894°N 72.440°W
    2018 Course(s) Shinnecock Hills Golf Club
    2018 Organized by USGA
    2018 Tour(s) PGA Tour
    European Tour
    Japan Golf Tour
    2018 Statistics Statistics
    2018 Par 70
    2018 Length 7,440 yards (6,800 m)
    2018 Field 156 players, 67 after cut
    2018 Cut 148 (+8)
    2018 Prize fund $12,000,000
    €10,244,150
    2018 Winner's share $2,160,000
    €1,843,947
    2018 Champion Champion
    2018 Brooks Koepka Brooks Koepka
    2018 281 (+1) 281 (+1)
    2018 Location Map Location Map
    2018 Shinnecock Hills
    Location in the United States
    Shinnecock Hills
    Location in New York
    Shinnecock Hills
    Location in the United States
    Shinnecock Hills
    Location in New York
    2018 ‹ 2017
    2019 ›
    ‹ 2017
    2019 ›
    Last edited by sandy666; Jun 17th, 2019 at 07:38 PM.
    I know you know but I forgot my Crystal Ball and don't know what you know



    In the first post, show the type of machine (PC / Mac) and the Office version you are working on
    impossible things we do on the spot. for miracles you need to wait for a while

  5. #5
    Board Regular DRSteele's Avatar
    Join Date
    Mar 2015
    Location
    Calgary
    Posts
    1,969
    Post Thanks / Like
    Mentioned
    5 Post(s)
    Tagged
    1 Thread(s)

    Default Re: Golf majors; retreive data from Wikipedia articles

    Thanks Sandy. I had to alter and manipulate all kinds of things in the query, but I finally got 448 dates for golf's four major tournaments into one table. Not all years' webpages for each tourney are identical in shape and layout, so it took some doing.

    It's still a mess from a formatting standpoint and dates before 1900 don't work in Excel. Certainly though this was much quicker than transcribing them all manually...wow. Much obliged.
    Last edited by DRSteele; Jun 17th, 2019 at 10:58 PM.
    Windows10, Excel 365 Insider
    If you don't declare what version of Excel, we will assume
    Windows10, Excel 365.
    Formulas here are always in
    Green background
    You can get the HTML Maker: https://www.mrexcel.com/forum/excel-...ins-links.html.

  6. #6
    Board Regular sandy666's Avatar
    Join Date
    Oct 2015
    Posts
    2,335
    Post Thanks / Like
    Mentioned
    15 Post(s)
    Tagged
    1 Thread(s)

    Cool Re: Golf majors; retreive data from Wikipedia articles

    structure of the tables before 1987 is different than >=1987

    Select Title Label Detail
    1897
    Tournament information Dates September 17, 1897
    1897
    Tournament information Location Wheaton, Illinois
    1897
    Tournament information Course(s) Chicago Golf Club
    1897
    Tournament information Organized by USGA
    1897
    Tournament information Format Stroke play - 36 holes
    1897
    Statistics Field 35
    1897
    Statistics Prize fund $335
    1897
    Statistics Winner's share $150



    Select Title Label Detail
    1996
    Tournament information Tournament information
    1996
    Dates June 13–16, 1996
    1996
    Location Bloomfield Hills, Michigan
    1996
    Course(s) Oakland Hills Country Club
    South Course
    1996
    Organized by USGA
    1996
    Tour(s) PGA Tour
    1996
    Statistics Statistics
    1996
    Par 70
    1996
    Length 6,974 yards (6,377 m)
    1996
    Field 156 players, 108 after cut
    1996
    Cut 148 (+8)
    1996
    Prize fund $2.4 million
    1996
    Winner's share $425,000
    1996
    Champion Champion
    1996
    Steve Jones Steve Jones
    1996
    278 (-2) 278 (-2)
    1996
    ‹ 1995
    1997 ›
    ‹ 1995
    1997 ›
    Last edited by sandy666; Jun 18th, 2019 at 02:59 AM.
    I know you know but I forgot my Crystal Ball and don't know what you know



    In the first post, show the type of machine (PC / Mac) and the Office version you are working on
    impossible things we do on the spot. for miracles you need to wait for a while

  7. #7
    Board Regular sandy666's Avatar
    Join Date
    Oct 2015
    Posts
    2,335
    Post Thanks / Like
    Mentioned
    15 Post(s)
    Tagged
    1 Thread(s)

    Cool Re: Golf majors; retreive data from Wikipedia articles

    After posting I realized structure of the tables are different but as I can see you almost solved the problem
    You didn't say what you are tryin' to achieve so M-code is simply for each single year
    I used DV List to choose each year so it's a bit different from what you need
    I know you know but I forgot my Crystal Ball and don't know what you know



    In the first post, show the type of machine (PC / Mac) and the Office version you are working on
    impossible things we do on the spot. for miracles you need to wait for a while

  8. #8
    Board Regular sandy666's Avatar
    Join Date
    Oct 2015
    Posts
    2,335
    Post Thanks / Like
    Mentioned
    15 Post(s)
    Tagged
    1 Thread(s)

    Cool Re: Golf majors; retreive data from Wikipedia articles

    in single table for all years
    here is just breakthrough between 1986 and 1987

    1987 Tournament information Tournament information
    1987 Dates June 18–21, 1987
    1987 Location San Francisco, California
    1987 Course(s) Olympic Club, Lake Course
    1987 Organized by USGA
    1987 Tour(s) PGA Tour
    1987 Format Stroke play
    1987 Statistics Statistics
    1987 Par 70
    1987 Length 6,709 yards (6,135 m)
    1987 Field 156 players, 77 after cut
    1987 Cut 147 (+7)
    1987 Prize fund $825,000
    1987 Winner's share $150,000
    1987 Champion Champion
    1987 Scott Simpson Scott Simpson
    1987 277 (-3) 277 (-3)
    1987 ‹ 1986
    1988 ›
    ‹ 1986
    1988 ›
    1986 Tournament information Dates June 12–15, 1986
    1986 Tournament information Location Shinnecock Hills, New York
    1986 Tournament information Course(s) Shinnecock Hills Golf Club
    1986 Tournament information Organized by USGA
    1986 Tournament information Tour(s) PGA Tour
    1986 Statistics Par 70
    1986 Statistics Length 6,912 yards (6,320 m)
    1986 Statistics Field 155 players, 70 after cut
    1986 Statistics Cut 150 (+10)
    1986 Statistics Prize fund $700,000
    1986 Statistics Winner's share $115,000
    I know you know but I forgot my Crystal Ball and don't know what you know



    In the first post, show the type of machine (PC / Mac) and the Office version you are working on
    impossible things we do on the spot. for miracles you need to wait for a while

  9. #9
    Board Regular sandy666's Avatar
    Join Date
    Oct 2015
    Posts
    2,335
    Post Thanks / Like
    Mentioned
    15 Post(s)
    Tagged
    1 Thread(s)

    Cool Re: Golf majors; retreive data from Wikipedia articles

    and end result

    1988 Dates June 16–20, 1988
    1988 Location Brookline, Massachusetts
    1988 Course(s) The Country Club
    Composite Course
    1988 Tour(s) PGA Tour
    1988 Par 71
    1988 Length 7,010 yards (6,410 m)
    1988 Field 156 players, 65 after cut
    1988 Cut 146 (+4)
    1988 Prize fund $1.0 miilion
    1988 Winner's share $180,000
    1988 Curtis Strange Curtis Strange
    1988 278 (-6), playoff 278 (-6), playoff
    1988 ‹ 1987
    1989 ›
    ‹ 1987
    1989 ›
    1987 Dates June 18–21, 1987
    1987 Location San Francisco, California
    1987 Course(s) Olympic Club, Lake Course
    1987 Organized by USGA
    1987 Tour(s) PGA Tour
    1987 Format Stroke play
    1987 Par 70
    1987 Length 6,709 yards (6,135 m)
    1987 Field 156 players, 77 after cut
    1987 Cut 147 (+7)
    1987 Prize fund $825,000
    1987 Winner's share $150,000
    1987 Scott Simpson Scott Simpson
    1987 277 (-3) 277 (-3)
    1987 ‹ 1986
    1988 ›
    ‹ 1986
    1988 ›
    1986 Dates June 12–15, 1986
    1986 Location Shinnecock Hills, New York
    1986 Course(s) Shinnecock Hills Golf Club
    1986 Organized by USGA
    1986 Tour(s) PGA Tour
    1986 Par 70
    1986 Length 6,912 yards (6,320 m)
    1986 Field 155 players, 70 after cut
    1986 Cut 150 (+10)
    1986 Prize fund $700,000
    1986 Winner's share $115,000
    1985 Dates June 13–16, 1985
    1985 Location Bloomfield Hills, Michigan
    1985 Course(s) Oakland Hills Country Club, South Course
    1985 Tour(s) PGA Tour
    1985 Par 70
    1985 Length 6,996 yards (6,397 m)
    1985 Field 156 players, 66 after cut
    1985 Cut 146 (+6)
    1985 Prize fund $650,000
    1985 Winner's share $103,000
    I know you know but I forgot my Crystal Ball and don't know what you know



    In the first post, show the type of machine (PC / Mac) and the Office version you are working on
    impossible things we do on the spot. for miracles you need to wait for a while

  10. #10
    Board Regular DRSteele's Avatar
    Join Date
    Mar 2015
    Location
    Calgary
    Posts
    1,969
    Post Thanks / Like
    Mentioned
    5 Post(s)
    Tagged
    1 Thread(s)

    Default Re: Golf majors; retreive data from Wikipedia articles

    Sandy, thanks for your dedication here. I was able to create a column under the field Select containing all the dates from 1860 onward. When the Query was invoked, some of the webpages reported an error becsue they don't exist, notably for the war years like 1917 and 1943. I then altered the M-code to reflect each of the four tournaments and to report the Dates only. As you noticed for the US_Open, there were years for each tournament when the Query stopped working and needed to have its columns adjusted. After these tinkerings, the Query reported a long column of Dates for each tournament and I now have 448 dates for the four major tournaments.

    Dealing with these dates is the new challenge. Here is a sample of my output. I will need to somehow create a start date and an end date for each tournament, which won't be easy because they are all laid out differently across and within each field.

    open usopen pga masters
    17 October 1860 October 4, 1895 October 10–14, 1916 March 22–25, 1934
    26 September 1861 July 18, 1896 September 16–20, 1919 April 4–8, 1935
    11 September 1862 September 17, 1897 August 17–21, 1920 April 3–6, 1936
    18 September 1863 June 17–18, 1898 August 14–18, 1922 April 1–4, 1937
    16 September 1864 September 14–15, 1899 September 24–29, 1923 April 2–4, 1938
    14 September 1865 October 4–5, 1900 September 15–20, 1924 March 31 – April 2, 1939
    13 September 1866 June 14–17, 1901 September 21–26, 1925 April 4–7, 1940
    26 September 1867 October 10–11, 1902 September 20–25, 1926 April 3–6, 1941
    23 September 1868 June 26–29, 1903 October 31 –

    November 5, 1927
    April 9–13, 1942
    16 September 1869 July 8–9, 1904 October 1–6, 1928 April 4–7, 1946
    15 September 1870 September 21–22, 1905 December 2–7, 1929 April 3–6, 1947
    13 September 1872 June 28–29, 1906 September 8–13, 1930 April 8–11, 1948
    4 October 1873 June 20–21, 1907 September 14–19, 1931 April 7–10, 1949
    10 April 1874 August 27–29, 1908 August 30 –

    September 4, 1932
    April 6–9, 1950
    10 September 1875 June 24–25, 1909 August 8–13, 1933 April 5–8, 1951
    30 September, 2 October 1876 June 17–20, 1910 July 24–29, 1934 April 3–6, 1952
    6 April 1877 June 23–26, 1911 October 17–23, 1935 April 9–12, 1953
    4 October 1878 August 1–2, 1912 May 24–30, 1937 April 8–12, 1954
    27, 29 September 1879 September 18–20, 1913 July 10–16, 1938 April 7–10, 1955
    9 April 1880 August 20–21, 1914 July 9–15, 1939 April 5–8, 1956
    Last edited by DRSteele; Jun 18th, 2019 at 03:14 PM.
    Windows10, Excel 365 Insider
    If you don't declare what version of Excel, we will assume
    Windows10, Excel 365.
    Formulas here are always in
    Green background
    You can get the HTML Maker: https://www.mrexcel.com/forum/excel-...ins-links.html.

Some videos you may like

User Tag List

Tags for this Thread

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •