Theoretically is the key word, I'm sad to say. In practice, it's so buggy and poorly designed that you might save more time if you did what you wanted by hand.
Let's start with the simplest possible script:
What does this do?
$xl = New-Object -com Excel.Application
$wb = $xl.Workbooks.Open("C:\Temp\file.xls")
Each invocation of this script creates a copy of Excel which then just hangs around. Indefinitely. Like this:
If you just created the COM object, Excel would exit. But since you opened a workbook, it does not. Not even closing the PowerShell window exits these instances. You have to actually go and kill them using a task manager application.
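Incidentally, you don't have to reach for Task Manager; PowerShell itself can hunt down the stragglers. A sketch (assuming every running excel.exe is one of these orphans — adjust the filter if you have a real Excel session open):

```powershell
# Kill orphaned Excel instances left behind by earlier script runs.
# WARNING: this terminates ALL excel.exe processes, including any
# interactive session with unsaved work.
Get-Process excel -ErrorAction SilentlyContinue | Stop-Process -Force
```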
Okay. Let's try modifying the script to close the workbook:
Does this work?
$xl = New-Object -com Excel.Application
$wb = $xl.Workbooks.Open("C:\Temp\file.xls")
$wb.Close($false)
Har har har. No:
But according to the docs, there's supposed to be a method named Close. Why is it not there? Hmmm.
Let's try this:
$xl = New-Object -com Excel.Application
$wb = $xl.Workbooks.Open("C:\Temp\file.xls")
[System.Threading.Thread]::Sleep(1000)
$wb.Close($false)
Yeah. We have to wait for unspecified amounts of time, in unspecified places, which merely increases the chances that the script will succeed.
This gives you an impression of the type of quality and robustness we are dealing with here.
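Rather than guessing at a fixed delay, a slightly less fragile pattern (still a workaround, not a fix) is to retry the call until it stops failing. A sketch, assuming the same script and that the not-yet-available Close surfaces as a catchable error:

```powershell
$xl = New-Object -com Excel.Application
$wb = $xl.Workbooks.Open("C:\Temp\file.xls")

# Retry Close with short pauses instead of one long, guessed sleep.
for ($i = 0; $i -lt 10; $i++) {
    try {
        $wb.Close($false)
        break
    } catch {
        Start-Sleep -Milliseconds 200
    }
}
```

This at least bounds the wait and stops as soon as the call succeeds, instead of always paying the full second.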
But the Excel processes are still hanging around. What else do we need to do for them to quit?
$xl = New-Object -com Excel.Application
$wb = $xl.Workbooks.Open("C:\Temp\file.xls")
[System.Threading.Thread]::Sleep(1000)
$wb.Close($false)
$xl.Workbooks.Close()
$xl.Quit()
Ah! Now the processes clean up. It's like a magic incantation: we have to tell Excel three times before it leaves.
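Even that incantation is sometimes not enough, because PowerShell's COM wrappers can keep the process alive until they are garbage-collected. A commonly cited, fuller version (a sketch; behavior varies by Excel and .NET version) explicitly releases the wrappers and forces a collection:

```powershell
$xl = New-Object -com Excel.Application
$wb = $xl.Workbooks.Open("C:\Temp\file.xls")
# ... work with $wb ...
$wb.Close($false)
$xl.Workbooks.Close()
$xl.Quit()

# Release the runtime-callable wrappers so excel.exe can actually exit.
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($wb) | Out-Null
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($xl) | Out-Null
[GC]::Collect()
[GC]::WaitForPendingFinalizers()
```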
Note that the $false parameter to Workbook.Close is necessary. Otherwise, Excel opens an interactive dialog asking whether we want to save changes to a file that the script opened and never modified.

Non-performance, non-concurrency
I eventually got a script running that would process workbooks and extract information from specific columns. As I type this, the script is chugging along at a leisurely pace of... 8 cells per second.

It's 2015; we have multi-core computers running at GHz speeds, gigabytes of RAM, and solid-state drives. And the architecture of PowerShell, combined with the Excel COM interface, allows me to extract information out of workbooks at about the speed of a dot-matrix printer in 1990.
I saw in Process Explorer that the Excel instance spawned by the script is using nearly 100% of a single core. Computers have multiple cores, so I thought I'd tweak the script to run several concurrent instances, and get the job done 4x or 8x faster.
Nope. After launching 5 concurrent instances, instead of a single Excel process consuming 11% total CPU, I had five Excel processes consuming 2% total CPU each. So not only is it slow, but there's also a global lock somewhere that prevents concurrency.
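One partial workaround for the per-cell slowness, for what it's worth: every cell access is a cross-process COM round trip, so reading a whole range into a 2-D array with a single call is usually dramatically faster than iterating over cells. A sketch, assuming the same workbook path and that the data of interest is in column B of the first sheet:

```powershell
$xl = New-Object -com Excel.Application
$wb = $xl.Workbooks.Open("C:\Temp\file.xls")
$sheet = $wb.Worksheets.Item(1)

# One COM round trip for the entire used range, instead of one per cell.
# For a multi-cell range, Value2 returns a 1-based 2-D array.
$data = $sheet.UsedRange.Value2

for ($r = 1; $r -le $data.GetLength(0); $r++) {
    $cell = $data[$r, 2]   # column B, for example
    # ... process $cell ...
}

$wb.Close($false)
$xl.Quit()
```

This doesn't fix the architecture, but it moves the bottleneck from thousands of COM calls to one.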