Extracting data from the Racing Post Website

white_lightning

New Member
Joined
Jun 7, 2011
Messages
22
Hi Guys

I am looking to extract basic info from the Racing Post website: stuff like horse form, opening prices, etc.

Just wondering if anyone could give me some pointers on either doing it myself (I have some programming knowledge, but it's basic) or getting someone else to program it for me.

Is this going to be ultra difficult to code myself?

Thanks
 

I would do a simple copy and paste of all the data, then, below it, extract what you need with formulas or code.
 
I was looking for something that runs some code to extract the necessary data into Excel, so that I could then do some processing.
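
As a rough sketch of that "extract, then process in Excel" workflow, the Python below writes already-extracted rows to CSV text, which Excel can open. Python is used here only for illustration (the thread is about Excel/VBA); the function name `rows_to_csv`, the column names and the file name are assumptions, and the row values are illustrative.

```python
import csv
import io

def rows_to_csv(rows, header):
    """Return CSV text containing the header line followed by the rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

# Illustrative values only; column names are hypothetical.
header = ["race_time", "horse", "age", "weight"]
rows = [["2:20", "On The Cusp", 4, "9-5"]]

csv_text = rows_to_csv(rows, header)
print(csv_text)

# To save a file that Excel can open (hypothetical file name):
# with open("runners.csv", "w", newline="") as f:
#     f.write(csv_text)
```

One thing to watch: when Excel opens a CSV it may convert values such as 9-5 into dates, so weights may need importing as text.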
 
@Ruddles It is not always possible to do a Web Query on the Racing Post site and transfer the information into Excel for a neat and tidy layout.

A number of users who access the site resort to parsing the InnerHtml text to capture the data. Some of the web pages are built with JavaScript, which is another ball game.
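
To give a feel for what parsing the HTML text involves, here is a minimal sketch in Python (a stand-in for illustration; in Excel the same idea would be written in VBA). It walks a small, hypothetical table and collects the text of each cell, row by row. The class name `CellCollector`, the helper `parse_rows` and the sample markup are all assumptions, not the site's real page.

```python
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Collect the text of every <th>/<td> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical, heavily simplified markup; a real page is far larger.
SAMPLE = """
<table>
  <tr><th>NO.</th><th>HORSE</th><th>AGE</th><th>WGT</th></tr>
  <tr><td>1</td><td><a href="#"><b>On The Cusp</b></a></td><td>4</td><td>9-5</td></tr>
</table>
"""

def parse_rows(html):
    """Return the table cells found in html as a list of rows."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows

for row in parse_rows(SAMPLE):
    print(row)
```

Each printed row is a plain list of cell texts, which is the shape you would then write into worksheet cells.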

However, though I couldn't find a thread on the Punters Lounge forum that I had referenced previously, I did find the following:

http://forum.punterslounge.com/f23/getting-data-rp-online-cards-29624/index15.html

which could get WhiteLightning started, particularly Post #291.

There are helpful pointers in the thread regarding the areas of information that WhiteLightning requires.

hth
 
Thanks for the info, guys.

As I'm a newbie, can you explain what parsing the HTML code means, and why JavaScript is a whole different ball game?

Do you mean some data will be available but some won't? Or is there a workaround for any problems?

Thanks
 
Can you post more details, including the URL for the site?

For example, do you want the data for particular horses, meetings, races, etc.?

I've not had a look yet, but it might be possible to do this without having to parse HTML.

As for the JavaScript, that might not be a problem.

Usually it's used to interact with the page/site and is triggered by things like clicking buttons.

It is possible to write code to click buttons etc., so you don't really need to worry about the script itself.

I really need to see the site, though.
 
Hi Norie

Thanks for the reply.

I'm looking for basic details to start with: things like race name, time and runners, and then some specific horse details like form, weight, etc.

The website is:

http://www.racingpost.com/horses2/cards/home.sd

This would be the main racing page, and then you would have to navigate your way to each individual race, etc.
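
Navigating from the main page to an individual race comes down to following that race's card link. The card URLs quoted in the HTML later in this thread carry `race_id` and `r_date` query parameters, so a small sketch (in Python, for illustration only) can pull them out. The function name `card_params` is an assumption, and the URL scheme is the 2011 one from this thread, which may have changed since.

```python
from urllib.parse import urlparse, parse_qs

def card_params(url):
    """Return (race_id, r_date) from a card URL, or None if either is missing."""
    query = parse_qs(urlparse(url).query)
    try:
        return query["race_id"][0], query["r_date"][0]
    except KeyError:
        return None

# Card URL as quoted in this thread's HTML sample.
url = "http://www.racingpost.com/horses2/cards/card.sd?race_id=532525&r_date=2011-06-11"
print(card_params(url))
```

Collecting these pairs from every card link on the main page would give the list of races to visit one by one.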

I did a little research: sometime in 2009 the Racing Post changed its interface somewhat, and I have heard that this makes data extraction from the site difficult. I have heard things like AJAX and JavaScript mentioned. I'm not 100% sure what this means, but obviously more work is required.

If anyone knows, can they briefly explain the old (easy) way and the new, more difficult way of extracting data in layman's terms, just so I can visualise what it all means?

Thanks
 

As an example, the web page source is HTML code, and the following shows the heading for the first race down to the detail for the first horse.

<td class="raceTime"><a href="http://www.racingpost.com/horses2/cards/card.sd?race_id=532525&r_date=2011-06-11" title="Click to view card: toteplacepot Win Without Backing A Winner Handicap">2:20</a></td>
<td class="raceTitle">
<p>
<strong class="uppercase">
<a href="http://www.racingpost.com/horses2/cards/card.sd?race_id=532525&r_date=2011-06-11" title="Click to view card: toteplacepot Win Without Backing A Winner Handicap">toteplacepot Win Without Backing A Winner Handicap</a></strong>
(CLASS 6)(4yo+ 0-55) Winner £2,184 13 runners <b>1m2f46y</b>
Good <em> ATR </em></p>
<p class="videLinks clearfix">
</p>
<p class="raceCondition"><b>Race Conditions: </b>£3,200 guaranteed <b>For</b> 4yo+ Rated 46-55 (also open to such horses Rated 45 and below) <b>Weights</b> highest weight 9st 2lb <b>Minimum weight</b> 8-7 <b>Penalties</b> after June 4th, each race won 6lb <b>On The Cusp's Handicap Mark</b> 58 <b>Entries</b> 20 pay £ 16 <b>Penalty value 1st</b>£2,183.68 <b>2nd</b>£644.80 <b>3rd</b>£322.56</p>
</td>
</tr>
</table>
<table class="cardGridGl">
<tr>
<th>NO.</th>
<th>FORM</th>
<th colspan="2">HORSE</th>
<th class="c">AGE</th>
<th class="r">WGT</th>
<th>TRAINER RTF%</th>
<th>JOCKEY</th>
<th class="r">OR</th>
<th class="r">TS</th>
<th class="r">RPR</th>
</tr>
<tr>
<td>
<b>1</b><sup>11</sup></td>
<td>
<b>7</b><b>2</b>0<b>2</b>31 </td>
<td class="rn">
</td>
<td class="h">
<a href="http://www.racingpost.com/horses/horse_home.sd?race_id=532525&r_date=2011-06-11&horse_id=728610" *******="return popup(this, {width:695, height:800})" title="Full details about this HORSE"><b>On The Cusp</b></a><sup title="Days Since Run">3</sup> 6xp <span title="Course Distance">
</span>
</td>
<td class="c">4</td>
<td class="r">9-5</td>
<td>
<a href="http://www.racingpost.com/horses/trainer_home.sd?trainer_id=15660" *******="return popup(this, {width:695, height:800})" title="Full details about this TRAINER">Richard Guest</a><sup><span title="the percentage of the stable's runners that have Run To Form in the last 14 days, based on RPR">39</span></sup></td>
<td>
<a href="http://www.racingpost.com/horses/jockey_home.sd?jockey_id=78520" *******="return popup(this, {width:695, height:800})" title="Full details about this JOCKEY">Robert L Butler</a><sup>3</sup></td>
<td class="r">
58 </td>
<td class="r">
68 </td>
<td class="r">
73 </td>
</tr>
<tr>
<td>


Parsing is what you would do programmatically in VBA to extract the race distance, race class, number of runners, etc.
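
A minimal sketch of that kind of extraction, written in Python rather than VBA purely for illustration: it strips the tags from the race heading line and picks out the class, runner count and distance with a regular expression. The names `strip_tags`, `parse_header` and `HEADER_RE` are assumptions, and the pattern is fitted to the one heading line quoted in this thread, so other races may need adjustments.

```python
import re

# Pattern fitted to a heading like "(CLASS 6)(4yo+ 0-55) Winner £2,184 13 runners 1m2f46y".
HEADER_RE = re.compile(
    r"\(CLASS\s+(?P<race_class>\d+)\)"                  # "(CLASS 6)"
    r".*?(?P<runners>\d+)\s+runners"                    # "13 runners"
    r"\s+(?P<distance>(?:\d+m)?(?:\d+f)?(?:\d+y)?)"     # "1m2f46y"
)

def strip_tags(html):
    """Replace tags with spaces and collapse whitespace."""
    return " ".join(re.sub(r"<[^>]+>", " ", html).split())

def parse_header(html):
    """Return class, runners and distance from a race heading, or None."""
    match = HEADER_RE.search(strip_tags(html))
    if match is None:
        return None
    return {
        "race_class": int(match["race_class"]),
        "runners": int(match["runners"]),
        "distance": match["distance"],
    }

# Heading line taken from the HTML quoted in this thread.
LINE = "(CLASS 6)(4yo+ 0-55) Winner £2,184 13 runners <b>1m2f46y</b>"
print(parse_header(LINE))
```

The same idea carries over to VBA using string functions or a regular-expression object: find a fixed marker such as "runners" and read the value next to it.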

Although some parts of the data are presented on the web page using JavaScript, it is still possible to bring that information into Excel.

hth


 