Quick Source Change?

MickeyPvX

New Member
Joined
Feb 14, 2011
Messages
45
Hey everybody! Been a while since I've been on here, and the first time on the Power BI section.

Anyways, I've been dabbling with Power Query to pull NCAA Basketball data off of a stats site to mess around with. However, the way the site is set up, the columns I want to add to my large table are all at different URLs, so I've built like 50 queries all pointing to different URLs to pull the data off the tables at each one. I'm pretty new to Power Query, so I have two questions:

1. Is there a better way to do that, i.e. can I build one query pointing to all the different sources and pull the data as I need it? Right now I have another query that merges all the individual ones as a solution.

2. Now I want to pull data from another year, which is another 50-ish URLs, but I really would rather not brute-force it like the first time. So I guess this will be easier if the answer to Question 1 is, "Yep, and here's how you do it..." Originally I was thinking about writing some sort of script to change all the source URLs from the original queries.

Any help is appreciated, let me know if I can provide any more helpful information.
 

Excel Facts

Difference between two dates
Secret function! Use =DATEDIF(A2,B2,"Y")&" years"&=DATEDIF(A2,B2,"YM")&" months"&=DATEDIF(A2,B2,"MD")&" days"
So, the short answer is "Yes, you can do that by writing a function which pull 1 page... then calling the function multiple times". The devil will be in the details.

You are about to write some custom M script, so... maybe go buy Ken Puls new book :)
 
Upvote 0
So, the short answer is "Yes, you can do that by writing a function which pull 1 page... then calling the function multiple times". The devil will be in the details.

You are about to write some custom M script, so... maybe go buy Ken Puls new book :)

Ah, is that the, "M is for (Data) Monkey," book? Looks like they've got a deal on it on Amazon Prime. In the meantime, I'll keep working on it to see if I can make any headway. Thanks!
 
Upvote 0
Normally I'd expect These URL to contain some strings that are the same and only some parts to be different. If you split them up in different columns in Excel and concatenate them (in Excel or even in Power Query, both using "&") it would be sufficient just to change the variable parts.
 
Upvote 0
Normally I'd expect These URL to contain some strings that are the same and only some parts to be different. If you split them up in different columns in Excel and concatenate them (in Excel or even in Power Query, both using "&") it would be sufficient just to change the variable parts.

It's proving a bit more difficult than the example I linked to earlier; in his example he's got all the columns he wants but just multiple pages of the same columns at different URLs. In my case, every page I go to should add another column to the table, with the Team name being the common one. So I kind of want to have something that does a buttload of merges based on what data it pulls from the sites.

Granted, the example gave me a good place to start, I think, but I'm still not figuring out how to get M to do what I want with the list of URLs I've given it. I got it to pull the data correctly, now I just need it to put it in the right order. That part's harder :/
 
Upvote 0
OK, I've figured out a way to do it, although it still has a bit of brute-force, "copy paste code," in the final data pull query.

I created a function based on the one in the example, but added on a bunch of nested joins to merge all the sources together. I tried for a while to figure out how to get it to run through each one like the sources, but just gave up on it. Here's some code, any input is welcome:

Code:
(Page as text) as table =>
let
    Source = Web.Page(Web.Contents("https://www.teamrankings.com/ncaa-basketball/stat/" & Page)),
    Data0 = Source{0}[Data],
    ColumnsNeeded = Table.SelectColumns(Data0,{"Team", "2013"}),
    Rename = Table.RenameColumns(ColumnsNeeded,{{"2013", Page}})
in
    Rename

This is my function I call over and over again in the next step, a combination of managing the page names and then joining all the tables together. This is where it gets clunky:

Code:
let
Source = //list of all the pages I'm looking for...

ToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),

Data = Table.AddColumn(ToTable, "DataPull", each GetData([Column1])),

Join1 = Table.NestedJoin(Data{0}[DataPull],{"Team"},Data{1}[DataPull],{"Team"},"NewCol",JoinKind.Inner),
    Expand1 =  Table.ExpandTableColumn(Join1, "NewCol", {Source{1}}, {Source{1}}),
Join2 = Table.NestedJoin(Expand1,{"Team"},Data{2}[DataPull],{"Team"},"NewCol",JoinKind.Inner),
    Expand2 =  Table.ExpandTableColumn(Join2, "NewCol", {Source{2}}, {Source{2}}),

//etc. etc. clunkiness...

Join53 = Table.NestedJoin(Expand52,{"Team"},Data{53}[DataPull],{"Team"},"NewCol",JoinKind.Inner),
    Expand53 =  Table.ExpandTableColumn(Join53, "NewCol", {Source{53}}, {Source{53}}),

RPIJoin1 = Table.NestedJoin(Expand53,{"Team"},FormatRPI{0}[Custom],{"Team"},"NewCol",JoinKind.Inner),
    RPIExpand1 = Table.ExpandTableColumn(RPIJoin1, "NewCol", {"rpi-rating-by-team"}, {"rpi-rating-by-team"}),
RPIJoin2 = Table.NestedJoin(RPIExpand1,{"Team"},FormatRPI{1}[Custom],{"Team"},"NewCol",JoinKind.Inner),
    RPIExpand2 = Table.ExpandTableColumn(RPIJoin2, "NewCol", {"sos-rpi-rating-by-team"}, {"sos-rpi-rating-by-team"}),

TrimPercent = //take the "%" off of the percentages...

ToNumber = //large chunk for converting all the data to numbers...

OrderColumns = //chunk for putting the columns where I want them

in
    OrderColumns

Yeah, 55 nested joins. There's another function I built called FormatRPI because the format of those tables is a little different that the other 53 attributes I'm looking for. It does the same thing as the GetData function, just has a couple more formatting steps in it. Anyways, now that it's built all I have to do is go back into the functions I'm calling and modify the URLs to point them towards whatever year I want to pull. Hooray!
 
Upvote 0
well done!
From your description I would have suggested to do a pivot in order to create the additional columns - but really hard to tell without looking at the actual data.
 
Upvote 0

Forum statistics

Threads
1,214,384
Messages
6,119,201
Members
448,874
Latest member
Lancelots

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top