2015 Year in Review

So back in January I set out a list of PowerShell goals for the year. It’s not over yet, but I thought I’d see how well I did on those goals.

1.  50 blog posts

  • I knew this one was ambitious, but I figured one post per week should be manageable. I’ve been close to that pace lately and should be able to hit this goal in 2016. Maybe I can get 25 in before the end of the year. 🙂

2.  New release of SQLPSX
3.  Separate release of ADOLIB

  • Didn’t exactly release these, but moved them to github, added POSH_ADO, and wrote about them.

4.  Second book (maybe in a different format, like Pluralsight?)
(if you missed it, my first book was released late last year here).

5. Teach 10 PowerShell classes at work

  • Taught 8, and recruited a second person to do beginning training

6. Work through the IIS and AD month of lunches books

  • I read part of the IIS book and have been able to use some of it at work. Didn’t get to AD

7. Build a virtualization lab at home and practice Hyper-V and VMWare

  • Built out virtual machines to do POSH_ADO testing and had a lot of fun. This will be on the list for next year as well

8. Do something cloudy (no idea what)

  • Wrote a small module to work with Keepass. Haven’t written about it yet.

Since there’s still some time left in the year (and I’m off work part of it), I may update this post or follow up with an update.

I appreciate everyone who reads my ramblings, and especially enjoy comments.

If you have any great ideas for PowerShell projects or topics you’d like me to write about, let me know in the comments.

–Mike

PowerShell Code Smell: Invoke-Expression (and some suggestions)

Code Smells

I’ve mentioned code smells in this blog before, but to recap, a code smell is a warning sign that you might be looking at bad code (for some value of bad).  Code smells are more like red flags than errors.  Some classic examples of code smells are very long functions, lots of global variables, and goto statements.  There are places where all of these make sense, but in general if you see them you wonder if the code could use some work.

In most languages there is a way to take a string of characters and execute it as if it were code.  Those functions (or keywords, or methods) are generally considered to be risky, because you are giving up some control over what code is run.  If the input is somehow compromised, your program will have become an attack vector.  Not a good place to be.

SQL Injection

A classic example of this is building SQL statements from input and including parameters in the string.  If you don’t use SQL-language parameters, you are open to a SQL-injection attack where a malicious user puts characters in the input which cause the SQL statement you’re building to include instructions you didn’t intend to execute.  SQL-injection is a well-understood attack and the mitigation is also well-known.  Using real parameters instead of string building is the answer.
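
As a concrete sketch (the server, database, and table names here are made up for illustration), here’s what a parameterized query looks like with ADO.NET from PowerShell:

# Illustration only - connection string and table name are made up
$userInput = read-host "Name to look up"

$connection = New-Object System.Data.SqlClient.SqlConnection 'Server=localhost;Database=Test;Integrated Security=True'
$command = $connection.CreateCommand()

# The placeholder stays in the SQL text; the value is bound as a parameter, never concatenated in
$command.CommandText = 'SELECT * FROM Users WHERE Name = @name'
$command.Parameters.AddWithValue('@name', $userInput) | Out-Null

try {
    $connection.Open()
    $reader = $command.ExecuteReader()
    while ($reader.Read()) { $reader['Name'] }
} finally {
    $connection.Close()
}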

Back to PowerShell

The cmdlet which PowerShell includes to allow you to execute a string as code is Invoke-Expression.  It’s pretty simple to show that it’s vulnerable to an injection attack.  Consider the following code, where the intent is to write “hello ” followed by the value of a variable.

$prm1="world';get-date #"
invoke-expression "write-host 'Hello $prm1'"

You can see that the expression that is being invoked includes some quotes, but the input has been “crafted” to close the quotes early and to comment out the trailing quote.  This isn’t a sophisticated example, but it should help illustrate the problem.  An attacker, by the way, wouldn’t be satisfied with outputting the date either.

What to do instead?

I had been thinking about this for a while, and a few days ago someone left a comment pointing to this post, which shows the problem but doesn’t give solutions.  To be fair, the article did talk about using the call operator (&), but it didn’t give a lot of details.  In the remainder of this post I will give some further guidance.  I don’t think that all Invoke-Expression calls can be eliminated, but this should go a long way towards a solution.

What if there are spaces in the command-line?

This one is covered in the article.  The call operator (&, also called the command invocation operator) can be used to execute commands that have spaces in them:

[screenshot: badPath]

The call operator takes the name of the command, not an arbitrary string, so injection doesn’t work here:

[screenshot: callOperator]

As noted in the help (about_operators), the call operator does not evaluate the string, so it can’t interpret arguments.  If you need to pass arguments to parameters, you can list them after the command like this:

PS C:\Bad Path> & "get-date" 10/10/10
Sunday, October 10, 2010 12:00:00 AM

You can also pass a command (the output of get-command) to the call operator, so get-command can be an easy way to “validate” the value that is being used. If get-command doesn’t return anything, it’s not a valid command.

What if I don’t know what the command needs to be?

This is a common question.  If I don’t know what command needs to be executed at the time I write the script, I can’t include it, right?  Actually, there’s no reason you have to know it beforehand.

$choice=read-host "Process or Service"
if ($choice -in 'process','service'){
    $cmd=get-command "get-$choice"
    & $cmd
} else {
    write-host 'You entered an invalid command'
}

What if I don’t know the argument values?

This one is simple.  Use variables.  This is bread-and-butter PowerShell.
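
For instance, a trivial sketch (the service-name prompt is just an example):

# The value comes from user input, but it's passed as a normal argument, not pasted into a string
$name = read-host "Which service?"
get-service -Name $name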

What if I don’t know which parameters will be receiving arguments?

If you don’t know which parameters will be presented arguments, you might think you’re stuck.  You could have a bunch of if/elseif/else statements trying to check each combination of parameters and include different command-lines with each, but that’s obviously (I hope) not a maintainable solution.  A much better solution is to use splatting, a technique that lets you use a hashtable (or array) to supply the parameters and arguments for a command.

If the command is a PowerShell command (cmdlet, function, script), you can build a hashtable with the parameters (names) and arguments (values) you need to pass.  Using a hashtable in this way is very similar to the parameter solution for SQL-injection.  Because the values are bound to parameters by the engine rather than by evaluating a string, you don’t leave room for an attacker to compromise the command-line.  A simple splatting example might look like this:

$parms=@{}
if($IncludeSubdirectories){
   $parms['Recurse']=$true
}
if($path){
    $parms['Path']=$path
}
get-childitem @parms

Note that when we splat the hashtable (doesn’t that sound nice), we use the splat operator (@) instead of the dollar-sign ($) usually used with variables.

What if I don’t know the command or the parameters?

The call operator works fine with splatting.  Feel free to combine both techniques.  Proxy functions, for example, use the call operator (against the original command) with $PSBoundParameters to “forward” arguments to the original command.
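
Here’s a stripped-down sketch of that pattern (not a real proxy function; a generated proxy would copy the original parameter block exactly):

function Get-ChildItemWrapper{
    # Simplified parameter block - a real proxy would mirror Get-ChildItem's parameters
    param(
        [string]$Path,
        [switch]$Recurse
    )
    # $PSBoundParameters contains only the parameters that were actually supplied,
    # so we can forward them to the original command with the call operator
    $original = get-command Get-ChildItem
    & $original @PSBoundParameters
}

Get-ChildItemWrapper -Path C:\Temp -Recurse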


Conclusion

With the call operator and splatting in your toolbox, there should be a lot fewer occasions where you’re tempted to use Invoke-Expression.  PowerShell gives you so many solid language features that it’s easy to write good code.

What are your thoughts?  Can you think of circumstances that would still require Invoke-Expression?  Let me know in the comments.


–Mike

ISE Helpers module on Github

After reading the post here, I thought I should share the (considerably less complicated) functions I’ve written to help with the ISE.

I just posted a couple of functions to a new repo on GitHub called ISEHelpers. Neither function is particularly exciting, but I’ve found them useful.

The first is called Edit-Module, and is used to open, in a new ISE tab, the .psm1 file of a module which you have imported.

For instance,

Edit-Module adolib
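
For the curious, here’s a minimal sketch of how such a function might look (the version in the repo may differ):

function Edit-Module{
    param([string]$Name)

    # Only works for modules that are already imported into the session
    $module = get-module -Name $Name
    if(-not $module){
        write-warning "Module '$Name' is not imported"
        return
    }

    # For a script module, .Path points at the .psm1 file; open it in a new ISE tab
    $psISE.CurrentPowerShellTab.Files.Add($module.Path) | Out-Null
}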

The second function is called Set-ISELocation, and it changes the current directory to the folder containing the file in the current tab.  It takes no parameters.
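
Again, a rough sketch of the idea (the actual implementation in the repo may differ):

function Set-ISELocation{
    # Bail out if the current tab has never been saved to disk
    if($psISE.CurrentFile.IsUntitled){
        write-warning 'The current tab has not been saved to a file yet'
        return
    }
    # Change to the folder that contains the file in the current tab
    set-location -Path (split-path -Path $psISE.CurrentFile.FullPath -Parent)
}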


Have you written any “ISE Helper” functions?  Let me know about them in the comments.


–Mike

The Two Faces of the ISE Run Button (and a bit of bonus evil)

I love the ISE. I’ve used other “environments”, but always end up using the good old ISE. I do use the awesome ISESteroids module by Tobias Weltner (powertheshell.com), but most of the time you can find me in the unadorned, vanilla ISE.

With that bit of disclaimer out of the way, there is something that came to my attention recently. The Run button on the toolbar does two different things, although it doesn’t make a big deal about it. The two things are similar enough that it’s easy to miss, and subtle enough that the difference isn’t important most of the time.

The two things are, unsurprisingly, both concerned with running what’s in the current tab. Since it’s the Run button, you’d expect that to be the case.

Face Number 1
The first thing the Run button does is run the code that’s in the current editor tab. It does this by copying the text, as input, down into the console area. An example is seen in the image below:
[screenshot: Screenshot_run_unsaved]

You can clearly see that the text in the editor has been copied to the command-line.

Face Number 2

The second thing it does is run the script that’s loaded in the current tab. It doesn’t just run the script either, it actually dot-sources it (i.e. runs the script in the global scope).

The behavior of the Run button depends entirely on whether the tab has been saved as a script file (.ps1) before. If so, it runs (dot-sources) the script. If not, it executes the text that’s in the tab. Note in the first screenshot that the tab in the ISE says “Untitled.ps1”, which means it has not been saved. In the second, it says “RunButton.ps1”, so it obviously has been saved at that point.

[screenshot: screenshot_run_saved]

The great thing about this behavior is that you can run stuff without saving it. Once you decide to save it, though (perhaps because you want to debug it), the same button and hotkeys run the script in almost exactly the same way.

If you remember in my last post Blogging and Rubber Duck Debugging, I discussed how sometimes writing a blog post makes things more clear.  Fortunately I usually realize where my thinking has gone wrong before I hit “publish”, but not always.  This post, for instance, has sat in my drafts folder since October of 2014 because I wasn’t sure about it.

I was certain that I had a script which worked differently in the two “modes” of the Run button.  I remember vividly typing the (not very complex) script in my ISE and running it successfully.  I saved the file and gave it to someone else to run “for real”, and it failed.  I tracked the failure down to the fact that I was using scope modifiers (script: or global:) and they acted differently in an unsaved editor versus in a file.  I am unable to reproduce the result now, though, so I am doubting my sanity.  It does seem possible, though, that the script: scope could behave differently in an actual script than it does in the global scope.

NEWSBREAK!

Typing the above confession paragraph was enough to dislodge the bad thinking!  Rubber duck debugging to the rescue.

Here’s the simplified code that I started to blog about 13 months ago:

$processed=@()
function ProcessValue{
    Param($value)
    if($processed -contains $value){
        "$value already processed"
    } else {
        "Processing $value"
        $global:processed+=$value
    }
}
'Value1','Value2','Value3','Value1'| foreach {ProcessValue $_}

The code is pretty simple. It “processes” values as long as they haven’t already been “processed” by the function.

My expectation when running the script (with the example pipeline at the bottom) was that it would show the first three values being processed and then report that “value1” was already processed. Pretty simple, and that’s what it shows in the ISE when you click Run.
[screenshot: screenshot_testscope_ISE.ps1]

The problem isn’t, in fact, that the Run button works differently depending on whether you’ve saved the file. The script failed when the other user ran it because he “ran” it. He didn’t load it into the ISE and click the Run button, he executed the script. The issue arises because dot-sourcing a script and running the script are not the same.

To illustrate, here’s what it looks like when you run the script:
[screenshot: screenshot_testscope_run.ps1]

Notice that it failed to see that value1 had already been processed. Dot-sourcing the script works just like the Run button.
[screenshot: screenshot_testscope_dotsource.ps1]

The “bug” in the script is that the first statement doesn’t include a scope modifier when it initializes the $processed variable. When the script is dot-sourced, that first instruction is already running in the global scope, so the variable is initialized as a list and it all works fine. When you run the script without dot-sourcing it, the initialization happens in the script: scope rather than the global scope, and the line in the loop that is supposed to be adding the value to a list is instead concatenating the values as strings. Because of that, the -contains operator never returns true and everything gets processed every time. One more screenshot to confirm that:

[screenshot: screenshot_testscope_run_variable.ps1]
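
If you want to see the concatenation behavior on its own (this experiment isn’t from the original script, it just isolates the difference), try this at the console:

$x = $null              # like $global:processed when the script runs in its own scope
$x += 'Value1'
$x += 'Value2'
$x -contains 'Value1'   # False - $x is the single string 'Value1Value2'

$y = @()                # like $processed when the initialization runs in the global scope
$y += 'Value1'
$y += 'Value2'
$y -contains 'Value1'   # True - $y is a two-element array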

Conclusion
So apparently the two faces of the Run button aren’t so bad. So what’s the bit in the title about “bonus evil”? There’s one tiny problem with the Run button: when it dot-sources a file, it doesn’t use the dot-source syntax in the ISE console. If you understand what’s going on, it’s not a big deal. If you don’t understand the difference (between running and dot-sourcing), you can end up beating your head against the wall trying to figure out what’s going on.

Postscript
I’m starting to think that using scope modifiers is a code smell. Not necessarily bad, but it might point out that something could be done better.

Thanks for sticking with me on this “longer than usual” post. Let me know what you think in the comments!

-Mike

Blogging and Rubber Duck Debugging

Have you ever heard of Rubber Duck Debugging? The idea is simple. If you’re having trouble debugging code, just put a rubber duck on your desk and explain what’s happening in your code to the duck. Seems absurd, but the act of verbalizing the code situation is usually enough to break the log-jam in your mind and allow you to see the issue.

Another similar technique is “another set of eyes”. I can’t count the number of times I’ve asked someone to look at my code (or had someone ask me to look at theirs) only to find a really simple bug. “I’ve been looking at this for an hour!!!!” A different perspective is all it takes sometimes to spot the problem.

I’ve noticed more than a few times that I start to write a post about something that I think I understand. The more I write, however, the more uncertain I feel. By the three-quarter mark of the post, I save a draft, break out the ISE (or spin up a new virtual machine, or something), and find out that what I thought I knew well enough to share with the world, I had completely wrong.

In that way, blogging is like using the entire world as a rubber duck or another set of eyes.

Just a thought I had (almost exactly a year ago) and finally got around to sharing.

Has this ever happened to you? I’d love to hear your stories in the comments.

–Mike

Why Adolib (and POSH_Ado)?

I’ve realized that in my explanations of Adolib and POSH_Ado, I left something important out. Why in the world am I spending all of this time and effort writing database access modules when there are already tools out there (SQLPS, for instance) which work?

The simple answer is that SQLPS is not good enough, for several reasons.

First, SQLPS is part of the SQL Server install, which is a big download. That’s quite a burden to place on a user just to get access to Invoke-SQLCmd.

Second, when I started writing Adolib (and the predecessor which is used at my company), SQLPS was still a snap-in rather than a module. This was in PowerShell 1.0 days, so it was the normal distribution method, but snap-ins were not fun to work with and that made SQLPS even more of a burden.

Third, although Invoke-SQLCmd has a lot of flexibility, it does not allow you to re-use the same connection for multiple commands. You connect (and authenticate) each time you want to run a SQL command. This seems wasteful to me.

Fourth, Invoke-SQLCmd uses strings for variable substitution rather than real parameters, so it’s vulnerable to SQL injection. While the other problems in this list can be overlooked, I have a harder time with this one. I realize that Invoke-SQLCmd is modeled to work like the command-line SQL tools, and that explains the string substitution, but there’s no good reason not to also support T-SQL parameters in statements.

Finally, the code in Adolib (and to some extent POSH_Ado) is pretty simple. It’s a good, easy to understand example of using .NET classes in PowerShell code. A friend at work who saw Adolib for the first time (reading this post) said that it seemed too easy. Adolib is very easy to use and easy enough to understand that you might find yourself adding features.

I work with SQL Server a lot, and most of the modules I use at work involve reading and/or writing values to SQL. Adolib doesn’t have all of the flexibility that SQLPS gives, but it does use parameters and allows connection re-use. It’s been with me for a long time (8 years?) and the more I use it the more I can’t imagine using anything else.

POSH_Ado is a natural progression from Adolib. If you need to work with multiple database platforms, it’s really nice to have a consistent interface to work with them all. The times I’ve needed this kind of functionality POSH_Ado has been very handy and saved a lot of time.

Have you used Adolib or POSH_Ado? Anything you think needs to be added or changed with either?

I look forward to hearing your opinions.

–Mike

PowerShell and MySQL : POSH_Ado_MySQL

Using PowerShell and MySQL together with POSH_Ado is just as easy as using it with SQL Server. You’ll need the POSH_Ado and POSH_Ado_MySQL modules; use this command to get started:

Import-Module POSH_Ado_MySQL

Once you’ve done that you’ll have the following functions at your disposal:

  • New-MySQLConnection
  • New-MySQLConnectionString
  • New-MySQLCommand
  • Invoke-MySQLCommand
  • Invoke-MySQLQuery
  • Invoke-MySQLStoredProcedure

These functions work just like the ones for SQLServer in AdoLib or POSH_Ado_SQLServer, except that they work with MySQL.
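
For example, a quick sketch of what usage might look like (the parameter names here are assumptions for illustration; check the module’s help for the real ones):

# Hypothetical usage - parameter names are assumptions, not taken from the module
$conn = New-MySQLConnection -Server 'mysqlserver01' -User 'me' -Password 'secret'
Invoke-MySQLQuery -Connection $conn -Sql 'SELECT * FROM customers'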

Inside POSH_Ado_MySQL, you’ll see that (just like POSH_Ado_SQLServer), it is simply importing the POSH_Ado module, specifying the MySQL ADO.NET provider name and the prefix (MySQL). Then, it calls the Set-MySQLADONetParameters function to add an option to the connection strings that are generated and to specify that there is no prefix for parameter names.

import-module POSH_Ado -args MySql.Data.MySqlClient -Prefix MySQL -force

# .NET (and PowerShell) do not like zero datetime values by default.  This option helps with that.
# http://dev.mysql.com/doc/refman/5.5/en/connector-net-connection-options.html
Set-MySQLADONetParameters -option @{'Allow Zero Datetime'='true'} -ParameterPrefix ''

Export-ModuleMember *-MySQL*

Hopefully by now you can see the power of the POSH_Ado project:

  1. All of the ADO.NET logic is in one place
  2. Consistent (but distinctly named) cmdlets for working with each platform
  3. Flexibility to set platform-specific options

There are currently 3 other platform-specific POSH_Ado modules: POSH_Ado_Oracle, POSH_Ado_Firebird, and POSH_Ado_DB2. It should be no trouble to create Postgres, SQLite, and OLEDB modules as well.

Do you have any projects where POSH_Ado could come in handy? What about other platforms to explore?

Let me know your thoughts in the comments.

-Mike

P.S. POSH_Ado and the platform-specific modules can be found here.

Breaking the rules with helper functions

One of my most popular answers on StackOverflow is also one which has a tiny bit of controversy. It involves how to “hide” helper functions in a module in order to keep them from being exported.

Export-ModuleMember Details
In case you’re unfamiliar with how exporting functions from a module works, here are the basic rules:

  1. If there are no Export-ModuleMember statements, all functions are exported
  2. If there are any Export-ModuleMember statements, only the functions named in an Export-ModuleMember statement are exported

In a similar question (which I answered the same way) a couple of other solutions are presented. Those solutions involve invoking the PSParser to find all of the functions, and while technically correct, I think they miss the point of the question.

Why hide helper functions?
In the context of a PowerShell module, a helper function is simply a function which supports the functionality of the “public” functions in the module, but isn’t appropriate for use by the end-user. A helper function may implement common logic needed by several functions or possibly interact with implementation details in the module which are abstracted away from the user’s viewpoint. Exporting helper functions provides no benefit for the public, and in fact can cause confusion as these extra functions get in the way of understanding the focus of the module. Thus, it is important to be able to exclude these helper functions from the normal export from the module.

Why it’s hard to hide helper functions
First, it’s not actually hard to hide helper functions, it’s just tedious. All you have to do is list each non-helper function in an Export-ModuleMember statement. Unfortunately, that means if you have 100 functions with only one helper function, you need to list each of the 99 functions in order to hide the single helper function. Also, if you add a function later, you need to remember to add it to the list of exported functions. Not a good prize in my book. The PSParser solutions are correct in that they work, but they are a big block of code that obscures the intent.

My easy solution and the broken rule
My solution is to name helper functions with a VerbNoun convention rather than the standard Verb-Noun convention and use Export-ModuleMember *-* to export all functions named like PowerShell cmdlets are supposed to be. Using a different naming convention is breaking an important rule in the PowerShell community, and you’ll see in the comments about my original answer that someone called me out on it.
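
As a minimal sketch (all of the function names here are made up for illustration), a module written this way might look like:

# MyModule.psm1 - hypothetical example

# Helper function: no dash in the name, so it won't match *-*
function GetConfigValue{
    param([string]$Key)
    # ...implementation detail shared by the public functions...
}

# Public functions use the normal Verb-Noun convention
function Get-Widget{
    param([string]$Name)
    $path = GetConfigValue -Key 'WidgetPath'
    # ...
}

function Remove-Widget{
    param([string]$Name)
    # ...
}

# Exports Get-Widget and Remove-Widget, but not GetConfigValue
Export-ModuleMember *-*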

Why the rule exists (and why I don’t care that I broke it)
PowerShell was designed and delivered as a very discoverable system. That is, you can use PowerShell to find out stuff about PowerShell, and once you know some PowerShell you can leverage that knowledge to use even more PowerShell. The Verb-Noun convention clearly marks PowerShell cmdlets (functions, scripts) as distinctive items, and the verbs are curated to help guide you to the same functionality in different arenas. For instance, my favorite example is the verb Stop. You could easily have used End, Terminate, Kill, or any number of other verbs in place of Stop, but because Stop is the approved verb you know it’s the one to use. Thus, when you start to look at services, you know it’s going to be Stop-Service. When you look at jobs, you know it will be Stop-Job.

By using Verb-Noun in your functions you make them fit nicely into the PowerShell ecosystem. Running into improperly named commands (either not following the convention or using unapproved verbs) is uncommon, and because of this things work nicely and everyone is happy.

Helper functions are not meant to be discoverable. They exist only in the private implementation of a module, and users never need to know that they exist, let alone try to figure out how to use them. For this reason, I don’t really mind breaking the rule.

I’d rather have this:

Export-ModuleMember *-*

Than this:

Add-Type -Path "${env:ProgramFiles(x86)}\Reference Assemblies\Microsoft\WindowsPowerShell\3.0\System.Management.Automation.dll"

Function Get-PSFunctionNames([string]$Path) {
    $ast = [System.Management.Automation.Language.Parser]::ParseFile($Path, [ref]$null, [ref]$null)
    $functionDefAsts = $ast.FindAll({ $args[0] -is [System.Management.Automation.Language.FunctionDefinitionAst] }, $true)
    $functionDefAsts | ForEach-Object { $_.Name }
}
Export-ModuleMember -Function ( (Get-PSFunctionNames $PSCommandPath) | Where { $_ -ne 'MyPrivateFunction' } )

or this:

$errors = $null 
$functions = [system.management.automation.psparser]::Tokenize($psISE.CurrentFile.Editor.Text, [ref]$errors) `
    | ?{(($_.Content -Eq "Function") -or ($_.Content -eq "Filter")) -and $_.Type -eq "Keyword" } `
    | Select-Object @{"Name"="FunctionName"; "Expression"={
        $psISE.CurrentFile.Editor.Select($_.StartLine,$_.EndColumn+1,$_.StartLine,$psISE.CurrentFile.Editor.GetLineLength($_.StartLine))
        $psISE.CurrentFile.Editor.SelectedText
    }
}
$functions | ?{ $_.FunctionName -ne "your-excluded-function" }

I’m really interested in feedback on this, since I’m coloring outside the lines. Do you favor practicality, or do you think I should follow the rule?

Let me know your thoughts in the comments.

-Mike

POSH_Ado : Inside POSH_Ado_SQLServer

In a previous post I introduced the POSH_Ado “project” and explained that it is a way to use the same code-base to access several different database platforms. I illustrated it with some sample calls to a SQL Server database using the POSH_Ado_SQLServer module and promised to show the internals of the module later. The time has come. Here’s how POSH_Ado_SQLServer works:

import-module POSH_Ado -args System.Data.SqlClient -Prefix SQLServer -force

Export-ModuleMember *-SQLServer*

That’s it. The module simply imports the POSH_Ado module, telling it what ADO.NET provider to use (System.Data.SqlClient) and what prefix to use for the imported cmdlets (SQLServer). It then, in turn, exports all of the cmdlets with the SQLServer prefix.

With that tiny bit of effort you get:

  • SQL and NT authenticated connections
  • Parameterized queries and stored procedures
  • Input and output parameters (no in/out parameters yet, though)
  • Ad-hoc or stored connections

What’s missing in this list? I can think of a couple of things (which I need to enter as issues on GitHub):

  • In/out parameters
  • SQL BulkCopy (it’s there in Adolib…just need to copy it to POSH_Ado_SQLServer)

Since the code for POSH_Ado is based on Adolib which targeted SQL Server, it shouldn’t be surprising to see that there’s not much to do to get POSH_Ado to work with SQL Server. In the next “episode”, I’ll connect to MySQL, and the real benefit of POSH_Ado should become apparent.

Let me know what you think in the comments!

-Mike

PowerShell List Assignment

PowerShell and lists of data go together hand-in-hand. Any time you execute a cmdlet, function, or script, the output is a list of objects which is placed in the output stream for processing.

Assigning a list to a variable is not very interesting, either. You just assign it and it’s done. Like this, for instance:

$files=dir c:\temp

Nothing to see here, we do this every time we use PowerShell. Lists on the right-hand side of the assignment operator are boring.

You might have even seen a trick for swapping variables using a comma on the left-hand side like this:

$a,$b=$b,$a

That’s kind of cool, but it seems like a pretty specific kind of thing to do. Fortunately for us, lists on the left-hand side can do more than this.

As an example, consider this line:

$a,$b=1,2,3

If you look at $a, you’ll see that it got the 1, and $b got 2 and 3.

We can expand the example:

$a,$b,$c=1,2,3,4,5,6

Now, $a gets 1, $b gets 2, and $c gets 3,4,5 and 6.

The pattern should be clear. Each variable on the left gets a single object, until the last one which gets all remaining objects. If we have more variables than values, the “extra” variables are $null. If you specify the same variable more than once, it keeps the last corresponding value.
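
A couple of quick examples of those edge cases:

$a,$b,$c = 1,2
# $a is 1, $b is 2, and $c is $null

$a,$a = 1,2
# $a ends up as 2, the last value assigned to it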

Why is this useful?

Well, if you want to work with a collection but treat the first item specially, now you have an easy way to do that.

$first,$rest = <however you get your collection>
<process $first>
<process $rest>
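
A concrete version of the same pattern (the path is just an example) might look like:

$first,$rest = dir c:\temp\*.log
"Handling $($first.Name) specially"
"Processing the other $($rest.Count) files"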

Probably not something you’ll do all the time, but it’s another trick in the bag.

Do you have any scenarios where this would be helpful? Let me know in the comments.

-Mike