Thanks Thanks:  0
Likes Likes:  0
Results 1 to 8 of 8

Thread: Quick Source Change?

  1. #1
    New Member
    Join Date
    Feb 2011
    Posts
    45
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Quick Source Change?


    Quick Source Change?
    Hey everybody! Been a while since I've been on here, and the first time on the Power BI section.

    Anyways, I've been dabbling with Power Query to pull NCAA Basketball data off of a stats site to mess around with. However, the way the site is set up, the columns I want to add to my large table are all at different URLs, so I've built like 50 queries all pointing to different URLs to pull the data off the tables at each one. I'm pretty new to Power Query, so I have two questions:

    1. Is there a better way to do that, i.e. can I build one query pointing to all the different sources and pull the data as I need it? Right now I have another query that merges all the individual ones as a solution.

    2. Now I want to pull data from another year, which is another 50-ish URLs, but I really would rather not brute-force it like the first time. So I guess this will be easier if the answer to Question 1 is, "Yep, and here's how you do it..." Originally I was thinking about writing some sort of script to change all the source URLs from the original queries.

    Any help is appreciated, let me know if I can provide any more helpful information.

  2. #2
    Board Regular scottsen's Avatar
    Join Date
    Mar 2014
    Location
    Seattle, WA
    Posts
    1,263
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Quick Source Change?

    So, the short answer is "Yes, you can do that by writing a function which pull 1 page... then calling the function multiple times". The devil will be in the details.

    You are about to write some custom M script, so... maybe go buy Ken Puls new book
    scott at tinylizard.com : Power Pivot Consultant, Trainer and Mercenary
    See my Power Pivot blog at http://tinylizard.com/blog

  3. #3
    New Member
    Join Date
    Feb 2011
    Posts
    45
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Quick Source Change?

    Quote Originally Posted by scottsen View Post
    So, the short answer is "Yes, you can do that by writing a function which pull 1 page... then calling the function multiple times". The devil will be in the details.

    You are about to write some custom M script, so... maybe go buy Ken Puls new book
    Ah, is that the, "M is for (Data) Monkey," book? Looks like they've got a deal on it on Amazon Prime. In the meantime, I'll keep working on it to see if I can make any headway. Thanks!

  4. #4
    New Member
    Join Date
    Feb 2011
    Posts
    45
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Quick Source Change?

    Found a great tutorial on how to do this:

    Iterating over multiple pages of web data using Power Query | Matt Masson

    I think this should quicken things up a LOT, but I'll still have to type in the individual page names when I build the first query. They're not just numerical as the example demonstrates. Oh well.

  5. #5
    Board Regular
    Join Date
    Feb 2015
    Location
    Berlin
    Posts
    544
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Quick Source Change?

    Normally I'd expect These URL to contain some strings that are the same and only some parts to be different. If you split them up in different columns in Excel and concatenate them (in Excel or even in Power Query, both using "&") it would be sufficient just to change the variable parts.
    __________________________
    Imke

    Blog: www.TheBIccountant.com

  6. #6
    New Member
    Join Date
    Feb 2011
    Posts
    45
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Quick Source Change?

    Quote Originally Posted by ImkeF View Post
    Normally I'd expect These URL to contain some strings that are the same and only some parts to be different. If you split them up in different columns in Excel and concatenate them (in Excel or even in Power Query, both using "&") it would be sufficient just to change the variable parts.
    It's proving a bit more difficult than the example I linked to earlier; in his example he's got all the columns he wants but just multiple pages of the same columns at different URLs. In my case, every page I go to should add another column to the table, with the Team name being the common one. So I kind of want to have something that does a buttload of merges based on what data it pulls from the sites.

    Granted, the example gave me a good place to start, I think, but I'm still not figuring out how to get M to do what I want with the list of URLs I've given it. I got it to pull the data correctly, now I just need it to put it in the right order. That part's harder :/

  7. #7
    New Member
    Join Date
    Feb 2011
    Posts
    45
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Quick Source Change?

    OK, I've figured out a way to do it, although it still has a bit of brute-force, "copy paste code," in the final data pull query.

    I created a function based on the one in the example, but added on a bunch of nested joins to merge all the sources together. I tried for a while to figure out how to get it to run through each one like the sources, but just gave up on it. Here's some code, any input is welcome:

    Code:
    (Page as text) as table =>
    let
        Source = Web.Page(Web.Contents("https://www.teamrankings.com/ncaa-basketball/stat/" & Page)),
        Data0 = Source{0}[Data],
        ColumnsNeeded = Table.SelectColumns(Data0,{"Team", "2013"}),
        Rename = Table.RenameColumns(ColumnsNeeded,{{"2013", Page}})
    in
        Rename
    This is my function I call over and over again in the next step, a combination of managing the page names and then joining all the tables together. This is where it gets clunky:

    Code:
    let
    Source = //list of all the pages I'm looking for...
    
    ToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    
    Data = Table.AddColumn(ToTable, "DataPull", each GetData([Column1])),
    
    Join1 = Table.NestedJoin(Data{0}[DataPull],{"Team"},Data{1}[DataPull],{"Team"},"NewCol",JoinKind.Inner),
        Expand1 =  Table.ExpandTableColumn(Join1, "NewCol", {Source{1}}, {Source{1}}),
    Join2 = Table.NestedJoin(Expand1,{"Team"},Data{2}[DataPull],{"Team"},"NewCol",JoinKind.Inner),
        Expand2 =  Table.ExpandTableColumn(Join2, "NewCol", {Source{2}}, {Source{2}}),
    
    //etc. etc. clunkiness...
    
    Join53 = Table.NestedJoin(Expand52,{"Team"},Data{53}[DataPull],{"Team"},"NewCol",JoinKind.Inner),
        Expand53 =  Table.ExpandTableColumn(Join53, "NewCol", {Source{53}}, {Source{53}}),
    
    RPIJoin1 = Table.NestedJoin(Expand53,{"Team"},FormatRPI{0}[Custom],{"Team"},"NewCol",JoinKind.Inner),
        RPIExpand1 = Table.ExpandTableColumn(RPIJoin1, "NewCol", {"rpi-rating-by-team"}, {"rpi-rating-by-team"}),
    RPIJoin2 = Table.NestedJoin(RPIExpand1,{"Team"},FormatRPI{1}[Custom],{"Team"},"NewCol",JoinKind.Inner),
        RPIExpand2 = Table.ExpandTableColumn(RPIJoin2, "NewCol", {"sos-rpi-rating-by-team"}, {"sos-rpi-rating-by-team"}),
    
    TrimPercent = //take the "%" off of the percentages...
    
    ToNumber = //large chunk for converting all the data to numbers...
    
    OrderColumns = //chunk for putting the columns where I want them
    
    in
        OrderColumns
    Yeah, 55 nested joins. There's another function I built called FormatRPI because the format of those tables is a little different that the other 53 attributes I'm looking for. It does the same thing as the GetData function, just has a couple more formatting steps in it. Anyways, now that it's built all I have to do is go back into the functions I'm calling and modify the URLs to point them towards whatever year I want to pull. Hooray!

  8. #8
    Board Regular
    Join Date
    Feb 2015
    Location
    Berlin
    Posts
    544
    Post Thanks / Like
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Default Re: Quick Source Change?

    well done!
    From your description I would have suggested to do a pivot in order to create the additional columns - but really hard to tell without looking at the actual data.
    __________________________
    Imke

    Blog: www.TheBIccountant.com

Some videos you may like

User Tag List

Like this thread? Share it with others

Like this thread? Share it with others

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •