Copy/extract all the words from word file and save it in separate excel column based on Alphabetical letter

thomsonreuters

New Member
Joined
Dec 9, 2017
Messages
26
Hi Experts,

I want to extract all the words from word file and paste in to the different excel column based Alphabetical letter using VBA.
Please could anyone help on this ....I had tried to fine through online but could not make it....it's very important for me

For example, if we assume above sentences present in the word file then my exception would be..

ABCDE
Allbasedcolumnword start with Dword start with E
Alphabeticalbutcould

Hope its clear

Thanks
JP
 

Excel Facts

Links? Where??
If Excel says you have links but you can't find them, go to Formulas, Name Manager. Look for old links to dead workbooks & delete.
Try this
Before running the macro, create 2 sheets, Sheet1 and Sheet2, the results on Sheet1.
Change in this line of the macro the name of your word document.
Set wrdDoc = wrdApp.Documents.Open("C:\trabajo\test.docx")

VBA Code:
Sub words_Alfba()
  Dim wrdApp As Word.Application, wrdDoc As Word.Document
  Dim sh2 As Worksheet, dic As Object
  Dim a As Variant, b As Variant, c As Variant, col As Variant
  Dim ltr As String, i As Long, n As Long
  
  Set sh2 = Sheets("Sheet2")
  sh2.Cells.ClearContents
  
  Set wrdApp = CreateObject("Word.Application")
  wrdApp.Visible = True
  Set wrdDoc = wrdApp.Documents.Open("C:\trabajo\test.docx")
  wrdDoc.Range.Copy
  sh2.Select
  sh2.Range("A1").Select
  sh2.Paste
  wrdApp.Quit
    
  Set dic = CreateObject("Scripting.Dictionary")
  For i = 1 To 28
    dic(i) = 1
  Next
  
  a = sh2.Range("A1", sh2.Range("A" & Rows.Count).End(3)).Value2
  n = Evaluate("=MAX(len(A1:A" & UBound(a) & ")-LEN(SUBSTITUTE(A1:A" & UBound(a) & ","" "","""")))")
  ReDim b(1 To n * UBound(a), 1 To 28)
  For i = 1 To UBound(a)
    For Each c In Split(a(i, 1), " ")
      If c <> "" Then
        ltr = UCase(Left(c, 1))
        If ltr Like "*[A-Z]*" Then
          col = Asc(ltr) - 64
        Else
          col = 28
        End If
        b(dic(col), col) = c
        dic(col) = dic(col) + 1
      End If
    Next
  Next
  
  With Sheets("Sheet1")
    .Cells.ClearContents
    .Range("A1").Resize(UBound(b), 28).Value = b
    .Select
  End With
End Sub
 
Upvote 0
I cant forget this help. Thank you so much. it's made my job easy.
 
Upvote 0
Hi Dante Amor ,

Sorry to disturb you again.....

if my source document is "PDF", not "word" then what kind of changes i should do in the coding part..... please could you help on this.....

I should learn VBA, it's looks fantastic....make job very easy...

Thanks
JP
 
Upvote 0
if my source document is "PDF", not "word" then what kind of changes i should do in the coding part..... please could you help on this.....

Unfortunately I cannot test with PDF.
You could use part of the macro if you manually copy all the information from the PDF and put it on sheet2.
Try this macro, the results on sheet1

VBA Code:
Sub words_PDF()
  Dim wrdApp As Word.Application, wrdDoc As Word.Document
  Dim sh2 As Worksheet, dic As Object
  Dim a As Variant, b As Variant, c As Variant, col As Variant
  Dim ltr As String, i As Long, n As Long, lr As Long, lc As Long
  
  Set sh2 = Sheets("Sheet2")
  sh2.Cells.ClearContents
      
  Set dic = CreateObject("Scripting.Dictionary")
  For i = 1 To 28
    dic(i) = 1
  Next
  
  lr = sh2.Cells.Find("*", , xlValues, , xlByRows, xlPrevious).Row
  lc = sh2.Cells.Find("*", , xlValues, , xlByColumns, xlPrevious).Column
  a = sh2.Range("A1", sh2.Cells(lr, lc)).Value2
  n = Evaluate("=MAX(len(A1:A" & UBound(a) & ")-LEN(SUBSTITUTE(A1:A" & UBound(a) & ","" "","""")))")
  ReDim b(1 To n * UBound(a), 1 To 28)
  For i = 1 To UBound(a)
    For Each c In Split(a(i, 1), " ")
      If c <> "" Then
        ltr = UCase(Left(c, 1))
        If ltr Like "*[A-Z]*" Then
          col = Asc(ltr) - 64
        Else
          col = 28
        End If
        b(dic(col), col) = c
        dic(col) = dic(col) + 1
      End If
    Next
  Next
  
  With Sheets("Sheet1")
    .Cells.ClearContents
    .Range("A1").Resize(UBound(b), 28).Value = b
    .Select
  End With
End Sub
 
Upvote 0
Try this
Before running the macro, create 2 sheets, Sheet1 and Sheet2, the results on Sheet1.
Change in this line of the macro the name of your word document.
Set wrdDoc = wrdApp.Documents.Open("C:\trabajo\test.docx")

VBA Code:
Sub words_Alfba()
  Dim wrdApp As Word.Application, wrdDoc As Word.Document
  Dim sh2 As Worksheet, dic As Object
  Dim a As Variant, b As Variant, c As Variant, col As Variant
  Dim ltr As String, i As Long, n As Long
 
  Set sh2 = Sheets("Sheet2")
  sh2.Cells.ClearContents
 
  Set wrdApp = CreateObject("Word.Application")
  wrdApp.Visible = True
  Set wrdDoc = wrdApp.Documents.Open("C:\trabajo\test.docx")
  wrdDoc.Range.Copy
  sh2.Select
  sh2.Range("A1").Select
  sh2.Paste
  wrdApp.Quit
   
  Set dic = CreateObject("Scripting.Dictionary")
  For i = 1 To 28
    dic(i) = 1
  Next
 
  a = sh2.Range("A1", sh2.Range("A" & Rows.Count).End(3)).Value2
  n = Evaluate("=MAX(len(A1:A" & UBound(a) & ")-LEN(SUBSTITUTE(A1:A" & UBound(a) & ","" "","""")))")
  ReDim b(1 To n * UBound(a), 1 To 28)
  For i = 1 To UBound(a)
    For Each c In Split(a(i, 1), " ")
      If c <> "" Then
        ltr = UCase(Left(c, 1))
        If ltr Like "*[A-Z]*" Then
          col = Asc(ltr) - 64
        Else
          col = 28
        End If
        b(dic(col), col) = c
        dic(col) = dic(col) + 1
      End If
    Next
  Next
 
  With Sheets("Sheet1")
    .Cells.ClearContents
    .Range("A1").Resize(UBound(b), 28).Value = b
    .Select
  End With
End Sub

=================
HI Dante Amor ,

when i use word document contains more number of page like more than 500 pages then its error "1004"

Error shows in below line..

.Range("A1").Resize(UBound(b), 28).Value = b


Plesae COuld you help on this


thanks
JP
 
Upvote 0
Unfortunately I cannot test with PDF.
You could use part of the macro if you manually copy all the information from the PDF and put it on sheet2.
Try this macro, the results on sheet1

VBA Code:
Sub words_PDF()
  Dim wrdApp As Word.Application, wrdDoc As Word.Document
  Dim sh2 As Worksheet, dic As Object
  Dim a As Variant, b As Variant, c As Variant, col As Variant
  Dim ltr As String, i As Long, n As Long, lr As Long, lc As Long
 
  Set sh2 = Sheets("Sheet2")
  sh2.Cells.ClearContents
     
  Set dic = CreateObject("Scripting.Dictionary")
  For i = 1 To 28
    dic(i) = 1
  Next
 
  lr = sh2.Cells.Find("*", , xlValues, , xlByRows, xlPrevious).Row
  lc = sh2.Cells.Find("*", , xlValues, , xlByColumns, xlPrevious).Column
  a = sh2.Range("A1", sh2.Cells(lr, lc)).Value2
  n = Evaluate("=MAX(len(A1:A" & UBound(a) & ")-LEN(SUBSTITUTE(A1:A" & UBound(a) & ","" "","""")))")
  ReDim b(1 To n * UBound(a), 1 To 28)
  For i = 1 To UBound(a)
    For Each c In Split(a(i, 1), " ")
      If c <> "" Then
        ltr = UCase(Left(c, 1))
        If ltr Like "*[A-Z]*" Then
          col = Asc(ltr) - 64
        Else
          col = 28
        End If
        b(dic(col), col) = c
        dic(col) = dic(col) + 1
      End If
    Next
  Next
 
  With Sheets("Sheet1")
    .Cells.ClearContents
    .Range("A1").Resize(UBound(b), 28).Value = b
    .Select
  End With
End Sub
Thanks. Will check and update you.

thanks
JP
 
Upvote 0
when i use word document contains more number of page like more than 500 pages then its error "1004"
What does the error message say?
Mouse over this UBound(b) , a number should appear, what number it appear or what data appears.
 
Upvote 0
What does the error message say?
Mouse over this UBound(b) , a number should appear, what number it appear or what data appears.
Hi,

Error as below

1590504309106.png


what number it appear? - 1996140 shows when i place the mouse over Ubound(b)

thanks
JP
 
Upvote 0

Forum statistics

Threads
1,214,800
Messages
6,121,641
Members
449,044
Latest member
hherna01

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top