Excel and VBA: using MSXML2.XMLHTTP to log in to a website

Nelson78

Active Member
Joined
Sep 11, 2017
Messages
438
Hello everybody.

I'm new to working with the web through GET and POST requests.

So far, I haven't had many problems scraping data from a link.


Code:
Sub getintoSITE()

    Dim URL As String, strResponse As String
    Dim objHTTP As Object

    URL = "............"

    Set objHTTP = CreateObject("MSXML2.XMLHTTP")

    With objHTTP
        ' Synchronous GET: the third argument (False) waits for the response
        .Open "GET", URL, False
        ' No Content-Type header is needed here, because a GET carries no body
        .send
        strResponse = .responseText
    End With

    ' Write the raw response text into cell A1 of the third sheet
    Sheets(3).Range("A1").Value = strResponse

End Sub
Now, the next step is getting past the authentication.


This link has the form for entering the login and password:

Code:
https://xxxxx/xxxxx/xxxxx/login.aspx
Then, if the operation is successful, I'm redirected to the desired link:

Code:
https://xxxxx/xxxxx/yyyyy.aspx

How can I approach this?

I know I have to send a POST request with the credentials. I also have Fiddler on my PC to inspect the cookies.

My first question is: should the POST request target the page that hosts the login form, or the page I'm redirected to afterwards?

I mean, something like this?

Code:
Sub getintoSITE()

    Dim URL_login As String, URL_goal As String, strResponse As String
    Dim objHTTP As Object

    URL_login = "https://xxxxx/xxxxx/xxxxx/login.aspx"
    URL_goal = "https://xxxxx/xxxxx/yyyyy.aspx"

    With CreateObject("MSXML2.XMLHTTP")
        .Open "POST", URL_login, False
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"

...............................

Thanks in advance for your tips.
 
Nelson78

After some more analysis, I think the initial steps are as follows:

1) Send a GET request to the login page in order to:
A - get the session cookies
B - get the login form's hidden fields
2) Send a POST request to log in, using A and B
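
The two steps above can be sketched as follows. This is only a sketch under assumptions, not a tested solution for this site: it swaps MSXML2.XMLHTTP for the WinHttp.WinHttpRequest.5.1 object, on the assumption that a WinHttpRequest instance keeps the cookies it receives and resends them on later requests made with the same object. `OpenLoginSession` and `SubmitLogin` are hypothetical names. The POST goes to the login page itself, since the form in this thread declares `action="login.aspx"`.

```vb
' Step 1: GET the login page.
' Returns the HTTP object (expected to hold the session cookies from now on)
' and hands the page HTML back through loginHtml.
Function OpenLoginSession(ByVal loginUrl As String, ByRef loginHtml As String) As Object
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    http.Open "GET", loginUrl, False          ' False = synchronous
    http.send
    If http.Status <> 200 Then
        Err.Raise vbObjectError + 1, "OpenLoginSession", _
                  "GET returned HTTP " & http.Status
    End If

    loginHtml = http.responseText
    Set OpenLoginSession = http
End Function

' Step 2: POST the form body to the same URL, reusing the same object
' so that the cookies from step 1 travel with the request.
Function SubmitLogin(ByVal http As Object, ByVal loginUrl As String, _
                     ByVal postData As String) As String
    http.Open "POST", loginUrl, False
    http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.send postData
    ' Redirects are followed by default, so on success this may already
    ' be the page behind the login.
    SubmitLogin = http.responseText
End Function
```

Between the two calls, `postData` would be built from the hidden fields found in `loginHtml` plus the credential fields, whose names are not shown in this thread.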

Inspecting the GET request to the login page with Fiddler:

Code:
HTTP/1.1 200 OK
Date: Mon, 24 Jun 2019 12:23:52 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Set-Cookie: ASP.NET_SessionId=blablabla123456; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=iso-8859-1
Content-Length: 13878
Set-Cookie: BIGipServerpool_xxx.yyyyyyyy.zz_http=123456789.12345.0000; path=/; Httponly; Secure
Vary: Accept-Encoding
Connection: Keep-Alive
As for the form, I can see a structure as follows:

Code:
<form name="form" method="post" action="login.aspx" onsubmit="javascript:return submitform();" id="form">
<div>
<input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" />
<input type="hidden" name="ZZZ456" id="ZZZ456" value="" />
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUJLTQ5MTk5MDYzD2QWAgIDD2QWAgITDzwrAAQBAA8WCB4VRW5hYmxlRW1iZWRkZWRTY3JpcHRzZx4cRW5hYmxlRW1iZWRkZWRCYXNlU3R5bGVzaGVldGceElJlc29sdmVkUmVuZGVyTW9kZQspclRlbGVyaWsuV2ViLlVJLlJlbmRlck1vZGUsIFRlbGVyaWsuV2ViLlVJLCBWZXJzaW9uPTIwMTguMS4xMTcuMzUsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49MTIxZmFlNzgxNjViYTNkNAIeF0VuYWJsZUFqYXhTa2luUmVuZGVyaW5naGRkGAEFHl9fQ29udHJvbHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYBBQpidG5QcmltYXJ5ciW9RVsKlsB2m0RSfaD/1/R+Ulc=" />
</div>

<script type="text/javascript">

'... then stuff, in which 

function submitform()

'...other stuff

</script>

</form>
Now, I need some tips...
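
One way to pull the hidden fields out of the login page is plain string searching. The helper below is a minimal sketch: `HiddenFieldValue` is a hypothetical name, and it assumes every input is written as in the form above, with double-quoted attributes and `name="..."` appearing before `value="..."` inside the same tag. It does not decode HTML entities, and a real HTML parser would be more robust.

```vb
' Returns the value attribute of the input whose name attribute is fieldName,
' or an empty string if the field (or its value attribute) is not found.
Function HiddenFieldValue(ByVal html As String, ByVal fieldName As String) As String
    Dim p As Long, q As Long, tagEnd As Long

    ' Locate name="fieldName" (the closing quote prevents prefix matches,
    ' e.g. __VIEWSTATE versus __VIEWSTATEGENERATOR)
    p = InStr(1, html, "name=""" & fieldName & """", vbTextCompare)
    If p = 0 Then Exit Function

    ' Remember where this tag ends so a later tag's value is not picked up
    tagEnd = InStr(p, html, ">")

    p = InStr(p, html, "value=""", vbTextCompare)
    If p = 0 Then Exit Function
    If tagEnd > 0 And p > tagEnd Then Exit Function

    ' Read everything between the quotes of value="..."
    p = p + Len("value=""")
    q = InStr(p, html, """")
    If q = 0 Then Exit Function

    HiddenFieldValue = Mid$(html, p, q - p)
End Function
```

For example, with the form above stored in a string variable `loginPage` (an assumed name), `HiddenFieldValue(loginPage, "__VIEWSTATE")` would return the long Base64 string and `HiddenFieldValue(loginPage, "__EVENTTARGET")` an empty string.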
 

Nelson78

Little by little, I'm going onward.

When I scrape the parameters (__VIEWSTATE, __PREVIOUSPAGE, ...), all of them seem to keep the same value every time, except

Code:
__EVENTVALIDATION
I mean: if I scrape it twice within a few minutes, it has the same value. If I scrape it once and then again an hour later, it has changed.

Could it have something like an expiration?
 

Nelson78

To try answering my own question: as long as the session has not expired (I think after 10 minutes), __EVENTVALIDATION keeps the same value.
 

Nelson78

I've made some more progress.

I have extracted the following values and used them to build the (very long) string strPostData.

Here are the values:
Code:
__VIEWSTATE
__LASTFOCUS
__EVENTARGUMENT
__EVENTTARGET
__VIEWSTATEGENERATOR
__PREVIOUSPAGE
__EVENTVALIDATION
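
Since values such as __VIEWSTATE are Base64 and can contain `+`, `/` and `=`, each name and value has to be percent-encoded before it is joined into a form body. The sketch below shows one way to do that. `UrlEncodeAscii` and `BuildPostData` are hypothetical helpers, and the encoder is only meant for ASCII input, which is an assumption that holds for Base64 tokens but not necessarily for credentials.

```vb
' Percent-encodes everything except unreserved characters (A-Z a-z 0-9 . _ ~ -).
' Intended for ASCII input only.
Function UrlEncodeAscii(ByVal s As String) As String
    Dim i As Long, ch As String, out As String
    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        If ch Like "[A-Za-z0-9._~-]" Then
            out = out & ch
        Else
            out = out & "%" & Right$("0" & Hex$(Asc(ch)), 2)
        End If
    Next i
    UrlEncodeAscii = out
End Function

' Joins parallel arrays of names and values into name1=value1&name2=value2...
' Both arrays are assumed to have the same bounds.
Function BuildPostData(ByVal fieldNames As Variant, ByVal fieldValues As Variant) As String
    Dim i As Long
    Dim parts() As String
    ReDim parts(LBound(fieldNames) To UBound(fieldNames))
    For i = LBound(fieldNames) To UBound(fieldNames)
        parts(i) = UrlEncodeAscii(CStr(fieldNames(i))) & "=" & _
                   UrlEncodeAscii(CStr(fieldValues(i)))
    Next i
    BuildPostData = Join(parts, "&")
End Function
```

For example, `BuildPostData(Array("__VIEWSTATE", "__EVENTTARGET"), Array("/wEP+abc=", ""))` yields `__VIEWSTATE=%2FwEP%2Babc%3D&__EVENTTARGET=`. The login and password fields would be added to the same arrays under whatever names the real form uses.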

Furthermore, two cookies are involved in the process:
Code:
Cookie 1 = "abc"
Cookie 2 = "def"

How can I send the cookies in the POST request (my unsuccessful attempt is below)?
Code:
Set reqHttp = CreateObject("MSXML2.XMLHTTP")
reqHttp.Open "POST", URL, False
reqHttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
reqHttp.setRequestHeader "Cookie", cookie1
reqHttp.setRequestHeader "Cookie", cookie2
reqHttp.send (strPostData)
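
A request normally carries a single Cookie header, with the `name=value` pairs separated by `"; "`, so one thing to try is combining both cookies into one header. The sketch below does that. It also uses WinHttp.WinHttpRequest.5.1 instead of MSXML2.XMLHTTP, on the assumption (not verified against this site) that MSXML2.XMLHTTP manages cookies itself and may not honour a manually set Cookie header. `JoinCookies` and `PostWithCookies` are hypothetical names.

```vb
' Joins "name=value" pairs into one Cookie header value, e.g. "a=1; b=2".
' Empty arguments are skipped.
Function JoinCookies(ParamArray pairs() As Variant) As String
    Dim i As Long, out As String
    For i = LBound(pairs) To UBound(pairs)
        If Len(pairs(i)) > 0 Then
            If Len(out) > 0 Then out = out & "; "
            out = out & pairs(i)
        End If
    Next i
    JoinCookies = out
End Function

' Sends the form body with both cookies in a single Cookie header
' and returns the response text.
Function PostWithCookies(ByVal url As String, ByVal postData As String, _
                         ByVal cookie1 As String, ByVal cookie2 As String) As String
    Dim reqHttp As Object
    Set reqHttp = CreateObject("WinHttp.WinHttpRequest.5.1")

    reqHttp.Open "POST", url, False
    reqHttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    reqHttp.setRequestHeader "Cookie", JoinCookies(cookie1, cookie2)
    reqHttp.send postData

    PostWithCookies = reqHttp.responseText
End Function
```

Here each cookie argument is expected to be a full pair as seen in Fiddler, for example `ASP.NET_SessionId=blablabla123456`, not just the value. If the GET and the POST are made with the same WinHttpRequest object, setting the header by hand may not be needed at all.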
 