Power Query: imported text data with no spaces

nascaline

New Member
Joined
Apr 11, 2021
Messages
3
Office Version
  1. 365
Platform
  1. Windows
Hi all,

I am new to Power Query and am currently having a problem importing data from PDF files: the imported text has lost all of its spacing and comes out as one continuous string.

What I get from import:
| Description
| PLASTERINGANDPAVING;Applyingwhitepaint; NipponPaintZeroTecPaint;
| green; NN1307-4 Gysum;asspecifiedindrawingnr.

My desired result:
| Description
| PLASTERING AND PAVING ; Applying white paint; Nippon Paint Zero Tec Paint;
| green; NN1307-4 Gysum ; as specified in drawing nr.

Is there any way to get the text data in the desired form above?
Many thanks

PDF file sample can be downloaded in the link below:
 

Alex Blakenburg

Well-known Member
Joined
Feb 23, 2021
Messages
630
Office Version
  1. 365
Platform
  1. Windows
I had the same issue when I tried to import your file. I don't think you are going to like the only solution I could find.
The only workaround I found is to open the PDF in MS Word, save it as HTM, and then import that file with Power Query as a text file.
I got that method from here: Import Tabular Data from PDF using Power Query - Excelerator BI, under the subheading "Import into Power Query".
They got the best result saving it as MHT, but on your PDF I got a better result saving it as HTM.

If you can't find a better option and decide to use this, follow the instructions in the link I provided. If anything is unclear, let me know and I will provide more details.
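
For anyone who wants to check that the HTM file saved from Word really kept the spaces before pointing Power Query at it, here is a minimal Python sketch. It is not Power Query M and is not the method from the linked article; it only pulls the text out of table cells using the standard library. The names `TableTextParser` and `extract_rows` and the sample markup are made up for illustration, and real Word output is much noisier than this fragment.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # The HTML source may wrap lines mid-sentence, so collapse
            # every run of whitespace into a single space.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html_text):
    """Return the table text as a list of rows (hypothetical helper)."""
    parser = TableTextParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical fragment standing in for a file saved from Word as HTM.
sample = """
<table>
<tr><td><p>Description</p></td></tr>
<tr><td><p><span>PLASTERING AND PAVING; Applying
white paint;</span> Nippon Paint Zero Tec Paint;</p></td></tr>
</table>
"""
rows = extract_rows(sample)
for row in rows:
    print(row)
```

To try it on a real export, read the saved file into a string and pass it to `extract_rows`. The file encoding depends on how Word saved the page, so that part is left as an assumption to check. Nested tables are not handled in this sketch.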
 
Solution

nascaline

New Member
Joined
Apr 11, 2021
Messages
3
Office Version
  1. 365
Platform
  1. Windows
Alex, thanks for the help! I just tried it myself and the result looks fine to me. :)
 
