Piping is probably one of the most underutilized features of PowerShell that I’ve seen in the wild. Supporting pipes allows you to write code that is much more expressive than simple imperative programming. However, most PowerShell documentation does not do a good job of demonstrating how to think about pipable functions. In this tutorial, we will start with functions written the “standard” way and convert them step by step to support pipes.
Here’s a simple rule of thumb: if you find yourself writing a foreach loop in PowerShell with more than a line or two in the body, you might be doing something wrong.
Consider the following output from a function called Get-Team:
Name Value
---- -----
Chris Manager
Phillip Service Engineer
Andy Service Engineer
Neil Service Engineer
Kevin Service Engineer
Rick Software Engineer
Mark Software Engineer
Miguel Software Engineer
Stewart Software Engineer
Ophelia Software Engineer
Let’s say I want to output the name and title. I might write the PowerShell as follows:
$data = Get-Team
foreach ($item in $data) {
    Write-Host "Name: $($item.Name); Title: $($item.Value)"
}
I could also use the PowerShell ForEach-Object cmdlet to do this instead of the foreach block.
# % is a shortcut for ForEach-Object
Get-Team | % {
    Write-Host "Name: $($_.Name); Title: $($_.Value)"
}
This is pretty clean, given that the foreach block is only one line. I’m going to ask you to use your imagination and pretend that our logic is more complex than that. In a situation like that, I would prefer to write something that looks more like the following:
Get-Team | Format-TeamMember
But how do you write a function like Format-TeamMember that can participate in the piping behavior of PowerShell? There is documentation about this, but it is often far from the introductory material, and thus I have rarely seen it used by engineers in their day-to-day scripting in the real world.
The Naive Solution
Let’s start with the naive solution and evolve the function toward something more elegant.
Function Format-TeamMember() {
    param([Parameter(Mandatory)] [array] $data)
    $data | % {
        Write-Host "Name: $($_.Name); Title: $($_.Value)"
    }
}

# Usage
$data = Get-Team
Format-TeamMember -Data $data
At this point the function is just a wrapper around the foreach loop from above, and thus adds very little value beyond isolating the foreach logic.
Let me draw your attention to the $data parameter. It’s defined as an array, which is good since we’re going to pipe the array to a foreach block. The first step toward supporting pipes in PowerShell functions is to convert list parameters into their singular form.
Convert to Singular
Function Format-TeamMember() {
    param([Parameter(Mandatory)] $item)
    Write-Host "Name: $($item.Name); Title: $($item.Value)"
}

# Usage
Get-Team | % {
    Format-TeamMember -Item $_
}
Now that we’ve converted Format-TeamMember to work with single elements, we are ready to add support for piping.
Begin, Process, End
The PowerShell pipe functionality requires a little extra overhead to support. There are three blocks that can be defined in your function, and all of your executable code should live in one of them:
- Begin fires when the first element in the pipe is processed (when the pipe opens). Use this block to initialize the function with data that can be cached over the lifetime of the pipe.
- Process fires once per element in the pipe.
- End fires when the last element in the pipe is processed (when the pipe closes). Use this block to clean up after the pipe executes.
Let’s add these blocks to Format-TeamMember.
Function Format-TeamMember() {
    param([Parameter(Mandatory)] $item)
    Begin {
        Write-Host "Format-TeamMember: Begin" -ForegroundColor Green
    }
    Process {
        Write-Host "Name: $($item.Name); Title: $($item.Value)"
    }
    End {
        Write-Host "Format-TeamMember: End" -ForegroundColor Green
    }
}
# Usage
Get-Team | Format-TeamMember
# Output
cmdlet Format-TeamMember at command pipeline position 2
Supply values for the following parameters:
item:
Oh noes! Now PowerShell is asking for manual input! No worries. There’s one more thing we need to do to support pipes.
ValueFromPipeline… ByPropertyName
If you want data to be piped from one function into the next, you have to tell the receiving function which parameters will be received from the pipeline. You do this by means of two attributes: ValueFromPipeline and ValueFromPipelineByPropertyName.
ValueFromPipeline
The ValueFromPipeline attribute tells the PowerShell function that it will receive the whole value from the previous function in the pipe.
Function Format-TeamMember() {
    param([Parameter(Mandatory, ValueFromPipeline)] $item)
    Begin {
        Write-Host "Format-TeamMember: Begin" -ForegroundColor Green
    }
    Process {
        Write-Host "Name: $($item.Name); Title: $($item.Value)"
    }
    End {
        Write-Host "Format-TeamMember: End" -ForegroundColor Green
    }
}

# Usage
Get-Team | Format-TeamMember

# Output
Format-TeamMember: Begin
Name: Chris; Title: Manager
Name: Phillip; Title: Service Engineer
Name: Andy; Title: Service Engineer
Name: Neil; Title: Service Engineer
Name: Kevin; Title: Service Engineer
Name: Rick; Title: Software Engineer
Name: Mark; Title: Software Engineer
Name: Miguel; Title: Software Engineer
Name: Stewart; Title: Software Engineer
Name: Ophelia; Title: Software Engineer
Format-TeamMember: End
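In the example above, Begin and End only print markers, but they are also the place to keep state that lives for the duration of the pipe, as promised earlier. Here is a minimal sketch of that idea; Measure-Team is a hypothetical function, not part of our module:

```powershell
Function Measure-Team {
    param([Parameter(Mandatory, ValueFromPipeline)] $item)
    Begin   { $count = 0 }  # Runs once, before the first element
    Process { $count++ }    # Runs once per element
    End     { Write-Host "Processed $count team members" }  # Runs once, after the last element
}

Get-Team | Measure-Team # Prints: Processed 10 team members
```

Variables initialized in Begin remain in scope for Process and End, which is what makes this kind of accumulation work.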
ValueFromPipelineByPropertyName
This is great! We’ve really moved things forward! But we can do better.
Our Format-TeamMember function now requires knowledge of the schema of the data coming from the calling function. The function is not self-contained in a way that makes it maintainable or usable in other contexts. Instead of piping the whole object into the function, let’s pipe in only the discrete values the function depends on.
Function Format-TeamMember() {
    param(
        [Parameter(Mandatory, ValueFromPipelineByPropertyName)] [string] $Name,
        [Parameter(Mandatory, ValueFromPipelineByPropertyName)] [string] $Value
    )
    Begin {
        Write-Host "Format-TeamMember: Begin" -ForegroundColor Green
    }
    Process {
        Write-Host "Name: $Name; Title: $Value"
    }
    End {
        Write-Host "Format-TeamMember: End" -ForegroundColor Green
    }
}
# Usage
Get-Team | Format-TeamMember
# Output
Format-TeamMember: Begin
Name: Chris; Title: Manager
Name: Phillip; Title: Service Engineer
Name: Andy; Title: Service Engineer
Name: Neil; Title: Service Engineer
Name: Kevin; Title: Service Engineer
Name: Rick; Title: Software Engineer
Name: Mark; Title: Software Engineer
Name: Miguel; Title: Software Engineer
Name: Stewart; Title: Software Engineer
Name: Ophelia; Title: Software Engineer
Format-TeamMember: End
Alias
In our last refactoring, we set out to make Format-TeamMember self-contained. Our introduction of the Name and Value parameters decouples us from having to know the schema of the previous object in the pipeline. Almost. We had to name our parameter Value, which is not really how Format-TeamMember thinks of that value; it thinks of it as the Title. In the context of our contrived module, however, Value is sometimes the name that is used for it. In PowerShell, you can use the Alias attribute to support multiple names for the same parameter.
Function Format-TeamMember() {
    param(
        [Parameter(Mandatory, ValueFromPipelineByPropertyName)] [string] $Name,
        [Alias("Value")]
        [Parameter(Mandatory, ValueFromPipelineByPropertyName)] [string] $Title # Change the name to Title
    )
    Begin {
        Write-Host "Format-TeamMember: Begin" -ForegroundColor Green
    }
    Process {
        Write-Host "Name: $Name; Title: $Title" # Use the newly renamed parameter
    }
    End {
        Write-Host "Format-TeamMember: End" -ForegroundColor Green
    }
}
# Usage
Get-Team | Format-TeamMember
# Output
Format-TeamMember: Begin
Name: Chris; Title: Manager
Name: Phillip; Title: Service Engineer
Name: Andy; Title: Service Engineer
Name: Neil; Title: Service Engineer
Name: Kevin; Title: Service Engineer
Name: Rick; Title: Software Engineer
Name: Mark; Title: Software Engineer
Name: Miguel; Title: Software Engineer
Name: Stewart; Title: Software Engineer
Name: Ophelia; Title: Software Engineer
Format-TeamMember: End
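A side benefit of binding by property name plus the alias: incoming objects may expose either a Value or a Title property, and both will bind to the $Title parameter. A quick sketch with made-up records, assuming the Format-TeamMember definition above is loaded:

```powershell
# Both property names bind to $Title: "Value" via the alias, "Title" directly.
[pscustomobject]@{ Name = "Dana"; Value = "Tester"  } | Format-TeamMember
[pscustomobject]@{ Name = "Lee";  Title = "Analyst" } | Format-TeamMember
```

Each statement opens and closes its own pipe, so each prints the Begin marker, one formatted member, and the End marker.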
Pipe Forwarding
Our Format-TeamMember function now supports receiving data from the pipe, but it does not return any information that can be forwarded to the next function in the pipeline. We can change that by returning the formatted line instead of passing it to Write-Host.
Function Format-TeamMember() {
    param(
        [Parameter(Mandatory, ValueFromPipelineByPropertyName)] [string] $Name,
        [Alias("Value")]
        [Parameter(Mandatory, ValueFromPipelineByPropertyName)] [string] $Title
    )
    Begin {
        # Do one-time operations needed to support the pipe here
    }
    Process {
        return "Name: $Name; Title: $Title" # Return the line instead of writing it to the host
    }
    End {
        # Clean up before the pipe closes here
    }
}
# Usage
[array] $output = Get-Team | Format-TeamMember
Write-Host "The output contains $($output.Length) items:"
$output | Out-Host
# Output
The output contains 10 items:
Name: Chris; Title: Manager
Name: Phillip; Title: Service Engineer
Name: Andy; Title: Service Engineer
Name: Neil; Title: Service Engineer
Name: Kevin; Title: Service Engineer
Name: Rick; Title: Software Engineer
Name: Mark; Title: Software Engineer
Name: Miguel; Title: Software Engineer
Name: Stewart; Title: Software Engineer
Name: Ophelia; Title: Software Engineer
Filtering
This is a lot of information. What if we wanted to filter the data so that we only see the people with the title “Service Engineer”? Let’s implement a function that filters data out of the pipe.
function Find-Role() {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] $item,
        [switch] $ServiceEngineer
    )
    Begin {
    }
    Process {
        if ($ServiceEngineer) {
            if ($item.Value -eq "Service Engineer") {
                return $item
            }
        }
        if (-not $ServiceEngineer) {
            # If no filter is requested then return everything.
            return $item
        }
        return # Not technically required, but shows the exit when an item is filtered out.
    }
    End {
    }
}
This should be self-explanatory for the most part. Let me draw your attention, though, to the return; statement that isn’t technically required. A mistake I’ve seen made in this scenario is to return $null. If you return $null, it adds $null to the pipeline as if it were a return value. If you want to exclude an item from being forwarded through the pipe, you must not return anything. While the return; statement is not syntactically required by the language, I find it helpful to communicate my intention that I am deliberately not adding an element to the pipe.
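To see why returning $null is a bug, consider this sketch; Find-BadFilter is a made-up function for illustration:

```powershell
Function Find-BadFilter {
    param([Parameter(Mandatory, ValueFromPipeline)] $item)
    Process {
        if ($item -gt 2) { return $item }
        return $null # BUG: emits $null into the pipe for every filtered item
    }
}

[array] $result = 1..5 | Find-BadFilter
$result.Length # 5, not 3 -- the two "filtered" items are present as $null entries
```

A bare return (or simply falling out of the Process block without outputting anything) is the correct way to drop an element from the pipe.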
Now let’s look at usage:
Get-Team | Find-Role | Format-TeamMember # No filter
Name: Chris; Title: Manager
Name: Phillip; Title: Service Engineer
Name: Andy; Title: Service Engineer
Name: Neil; Title: Service Engineer
Name: Kevin; Title: Service Engineer
Name: Rick; Title: Software Engineer
Name: Mark; Title: Software Engineer
Name: Miguel; Title: Software Engineer
Name: Stewart; Title: Software Engineer
Name: Ophelia; Title: Software Engineer
Get-Team | Find-Role -ServiceEngineer | Format-TeamMember # Filtered
Name: Phillip; Title: Service Engineer
Name: Andy; Title: Service Engineer
Name: Neil; Title: Service Engineer
Name: Kevin; Title: Service Engineer
Summary
Notice how clean the function composition is: Get-Team | Find-Role -ServiceEngineer | Format-TeamMember!
Pipable functions are a powerful language feature of PowerShell <rimshots/>. Writing pipable functions allows you to compose logic in a way that is more expressive than simple imperative scripting. I hope this tutorial has demonstrated how to modify existing PowerShell functions to support pipes.
I was reading this Stack Overflow question: How can I solve this: Nhibernate Querying in an n-tier architecture?
The author is trying to abstract away NHibernate and is being counseled rather heavily not to do so. In the comments there are a couple of blog entries by Ayende on this topic:
The false myth of encapsulating data access in the DAL
Architecting in the pit of doom: the evils of the repository abstraction layer
Ayende is pretty down on abstracting away NHibernate. The answers on Stack Overflow push the questioner toward just standing up an in-memory SQLite instance and executing the tests against that.
The SQLite solution is pretty painful with complex databases. It requires that you set up an enormous amount of data that isn’t really germane to your test in order to satisfy FK and other constraints. The ceremony of creating this extra data clutters the test and obscures the intent. To test a query for employees who are managers, I’d have to create Departments and Job Titles and Salary Types, etc., etc., etc. Dis-like.
What problem am I trying to solve?
In the .NET space developers tend to want to use LINQ to access, filter, and project data. NHibernate (partially) supports LINQ via an extension method off of ISession. Because ISession.Query<T> is an extension method, it is not stubbable with free mocking tools such as RhinoMocks, Moq, or my favorite, NSubstitute. This is why people push you to use the SQLite solution: the piece of the underlying interface that you want to use most of the time is not built for stubbing.
I think a fundamental problem with NHibernate is that it is trying to serve two masters. On the one hand, it wants to be a faithful port of Hibernate. On the other, it wants to be a good citizen in the .NET ecosystem. Since .NET has LINQ and Java doesn’t, the support for LINQ is shoddy and doesn’t fit in well with the rest of the API design. LINQ support is an “add-on” to the Java-style API, not a first-class citizen. I think this is why it was implemented as an extension method instead of as part of the ISession interface.
I firmly disagree with Ayende on Generic Repository. However, I do agree with some of the criticisms he offers against specific implementations. I think his arguments are a bit of a straw man, though. It is possible to do Generic Repository well.
I prefer to keep my IRepository interface simple:
public interface IRepository : IDisposable
{
    IQueryable<T> Find<T>() where T : class;
    T Get<T>(object key) where T : class;
    void Save<T>(T value) where T : class;
    void Delete<T>(T value) where T : class;
    ITransaction BeginTransaction();
    IDbConnection GetUnderlyingConnection();
}
Here are some of my guidelines when using a Generic Repository abstraction:
- My purpose in using Generic Repository is not to “hide” the ORM, but:
  - to ease testability.
  - to provide a common interface for accessing multiple types of databases (e.g., I have implemented IRepository against relational and non-relational databases).
- Most of my storage operations follow the Retrieve-Modify-Persist pattern, so Find<T>, Get<T>, and Save<T> support almost everything I need.
- I don’t expose my data models outside of self-contained operations, so Attach/Detach are not useful to me.
- If I need any of the other advanced ORM features, I’ll use the ORM directly and write an appropriate integration test for that functionality.
- I don’t use Attach/Detach, bulk operations, Flush, Futures, or any other advanced features of the ORM in my IRepository interface. I prefer an interface that is clean, simple, and useful in 95% of my scenarios.
- I implemented Find<T> as an IQueryable<T>. This makes it easy to use the Specification pattern to perform arbitrary queries. I wrote a specification package that targets LINQ for this purpose.
  - In production code it is usually easy enough to append where-clauses to the exposed IQueryable<T>.
  - For dynamic user-driven queries I will write a class that converts a flat request contract into the where-clause needed by the operation.
- I expose the underlying connection so that if someone needs to execute a sproc or raw SQL there is a convenient way of doing that.
At Redacted Associates, we’ve been having a discussion about whether we should use the Generic Repository pattern on top of NHibernate. We have a simple IRepository interface, like the one shown above.
For my part, I don’t like to spend a lot of time worrying about the way my ORM interacts with the database. I prefer to spend my design energy on architecting the application such that interacting with the database is a minor implementation detail, almost an afterthought.
At least one of my co-workers disagrees, and has given a really good argument for a specific case in which using NHibernate’s features directly saved some work. This discussion has spurred me to ask myself “what are the most important features of an ORM?” and “at what level of abstraction should we work with an ORM?” There’s no way to answer these questions without identifying your purpose in using the ORM to begin with. With that in mind, I decided to categorize the features we all look for in an ORM and compare them to our Generic Repository implementation.
ORM features basically fall into one of three categories:

- Queryability
  - Linq provider. In .NET, Linq remains the most discoverable way to query an ORM. NHibernate has the QueryOver API, but I find it to be hopelessly unreadable for anything but the simplest query.
  - Specification pattern. The easiest specification pattern in .NET relies on Linq. It’s a very nice way to allow API clients to construct their own queries without concerning themselves with database schema details. In an SOA architecture, it provides a flat contract to support complex query results. It minimizes the number of service methods because you don’t have to write GetCustomersByNameAndCityAndProduct.
  - Fully mapped relationships between objects. I depend on having a fully expressed query model to use for Linq queries. Getting the right data to work with often involves a number of database relationships, and it’s impossible to predict when some new object or property will be needed to satisfy a query. It’s easiest to ensure that the model fully and accurately expresses the database schema, with all fields and relationships present in the model.
  - Id navigation properties. Id navigation properties as a companion to the object relationship properties are really handy. They can reduce query-syntax clutter quite a bit. Employee.ManagerId is cleaner than Employee.Manager.Id. Some ORMs will pull back the whole Manager just to get the Id. I hate that.
  - Full support for all relationship types (one-to-one, one-to-many, many-to-many). These relationships are standard in relational databases. Any object-relational mapper should support them.
  - Lazy loading
- Behavior
  - Cascade mappings. This is not personally a value to me, but I recognize that in some cases it’s useful.
  - Trigger behavior. This sort of behavior is useful when you want the ORM to handle things like audit fields, soft deletes, or log table entries.
  - Sql efficiencies. Sometimes pulling back large datasets and mapping them to in-memory objects can be very expensive. If performance is a concern, it’s nice to be able to have the ORM optimize the operation. NHibernate’s “Merge” operation is a good example of this.
- Testability
  - In-memory testability
  - Mockable/Stubbable
I composed the following table listing the features we are seeking from an ORM and how each tool fares against our needs.
| Feature | NHibernate | Entity Framework | Generic Repository | Micro ORMs |
| --- | --- | --- | --- | --- |
| Linq provider | Not fully implemented | Fully implemented | Depends on ORM | No |
| Specification pattern | Easy to implement against a partially implemented Linq provider; hard otherwise | Easy to implement | Depends on ORM | No |
| Can fully map relationships | Yes | Yes | Depends on ORM | No |
| Id navigation properties | Not without extreme (and not very useful) hacks | Yes | Depends on ORM | Yes |
| Full support for relationship types | One-to-one doesn’t work well; results in N+1 queries on one side of the relationship | Haven’t tested this | Depends on ORM | Sort of |
| Lazy loading | Yes | Yes | Depends on ORM | No |
| Cascade mappings | Yes | Yes | Depends on ORM | No |
| Trigger behavior | Yes | Yes | Depends on ORM | No |
| Sql efficiencies | Yes | Yes | Depends on ORM | No |
| In-memory testability | Yes, provided you use SQLite | Yes, provided you use SQL Compact edition, or you can interface your DbContext | Yes | No |
| Mockable/Stubbable | Mostly. Some of the methods you use on ISession are in fact extension methods; .Query is an extension method, which is problematic since that’s an obvious one I’d like to stub | Mostly. Methods such as .Include() are extension methods with no in-memory counterpart; if I really need .Include() I’ll write an integration test instead | Yes | No |
| Notes | | | | Fantastic for quick and dirty CRUD on single tables; not so great otherwise |
Takeaways
My advice is to use Generic Repository except when you need to get close to the metal of your ORM for some fine-grained control over your data access.
I was reading some interesting blogs on the specification pattern today. I was hoping to find a NuGet package, but no such luck. So, I put together the best ideas from the blogs I read and created a new NuGet package called Isg.Specification. Much of this work is not original; my contribution is putting the work of these other developers together in an integrated way and publishing it on NuGet.
Although the write-up of the specification pattern on Wikipedia doesn’t mention my specific usage goal, I want to create a parameter object that provides a specification to be used by a LINQ query in Entity Framework. In this way, I can provide a single Get<T>(ParameterObject specification) method to service all Get queries. This simplifies a service interface by removing the various GetById, GetByName, and GetByHasInvoices methods. All that differs between those methods is the predicate applied to the query, so passing the predicate into Get() as a parameter just feels natural. However, I don’t want the caller to be able to specify any predicate they want, so using a parameter object that converts itself into a predicate serves the purpose of limiting the caller’s query access as well as simplifying the query API.
Even though my target usage is Entity Framework, and the source code is in the same repository as Isg.EntityFramework, Isg.Specification in no way depends on Entity Framework.
The simplest form of the specification pattern uses simple boolean functions. However, since my target usage is Entity Framework, I needed the functions to be convertible to SQL by Entity Framework. For this reason, my implementation of ISpecification<T> uses Expression<Func<T, bool>> instead of Func<T, bool> as its signature.
public interface ISpecification<T>
{
    Expression<Func<T, bool>> IsSatisfied();
}
The simplest way to get started using the library is to inherit from CompositeSpecification<T>. Using the same sample DbContext I used for Isg.EntityFramework, here is a sample implementation of a Specification object against Customer:
public class CustomerSpecification : CompositeSpecification<Customer>
{
    public string Name { get; set; }
    public bool? HasInvoices { get; set; }

    protected override IEnumerable<Expression<Func<Customer, bool>>> GetExpressions()
    {
        if (!string.IsNullOrWhiteSpace(Name))
            yield return ByName();

        if (HasInvoices.HasValue)
            yield return ByHasInvoices();
    }

    private Expression<Func<Customer, bool>> ByHasInvoices()
    {
        if (HasInvoices.Value)
            return c => c.Invoices.Any();
        return c => !c.Invoices.Any();
    }

    private Expression<Func<Customer, bool>> ByName()
    {
        return c => c.Name.StartsWith(Name);
    }
}
Usage is pretty easy. ISpecification<T> exposes a single method called IsSatisfied(). Feed the result of this method to a LINQ Where clause and you’re off to the races. CompositeSpecification<T> aggregates the expressions provided by GetExpressions() into a single And-ed expression and uses that for the filter.
In my test cases I create two customers, one of whom has invoices and one who does not. You can review the test cases in full here:
Here is sample usage:
[Test]
public void HasInvoices()
{
    // Arrange
    var filter = new CustomerSpecification()
    {
        HasInvoices = true,
    };

    // Act
    using (var context = new CustomerDbContext())
    {
        var query = context.Customers
            .Where(filter.IsSatisfied())
            .ToList();

        // Assert
        Assert.That(query, Has.Count.EqualTo(1));
        var result = query.Single();
        Assert.That(result.Name, Is.EqualTo("Dagny Taggart"));
    }
}
I just wanted to give some link-love to the blogs that were helpful to me in implementing the specification pattern. Stay tuned for Isg.Specification on NuGet!
http://davedewinter.com/2009/05/31/linq-expression-trees-and-the-specification-pattern/
http://blogs.msdn.com/b/meek/archive/2008/05/02/linq-to-entities-combining-predicates.aspx
http://iainjmitchell.com/blog/?p=550
http://huyrua.wordpress.com/2010/07/13/entity-framework-4-poco-repository-and-specification-pattern/
http://devlicio.us/blogs/jeff_perrin/archive/2006/12/13/the-specification-pattern.aspx