I don't know if I worded it correctly, but for a simple example let's say we have a collection of Point3 values (say 1M).

We have a method called Offset that adds another Point3 value to each of these values, returning new Point3 values. Let's say the method is static.

The Point3 type is immutable.

The question is, should I have a method like this:

public static Point3 Offset ( Point3 a, Point3 b )

or

public static IEnumerable<Point3> Offset ( IEnumerable<Point3> a, IEnumerable<Point3> b )

To me #1 seems like a better choice to break the task into separate tasks for different threads.

What do you think? Any advantages to #1 or #2?

+2  A: 

You should probably have the first, and have the second call the first.

mquander
+2  A: 

#1 seems simpler and cleaner, and you could always parallelize it from outside. I don't see a reason to use #2 exclusively, unless you've neglected to state a crucial detail. If you decide you want to routinely parallelize this sort of loop in the same way, make #2 call #1.
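To illustrate "make #2 call #1", here is a minimal sketch. The Point3 struct is hypothetical (the question doesn't show the real type), and stopping at the end of the shorter sequence is my assumption, since the question doesn't say what should happen on a length mismatch.

```csharp
using System.Collections.Generic;

// Hypothetical minimal Point3; the real type is not shown in the question.
public struct Point3
{
    public readonly double X, Y, Z;

    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class PointOps
{
    // #1: the core operation on a single pair of points.
    public static Point3 Offset(Point3 a, Point3 b)
    {
        return new Point3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }

    // #2: a convenience overload that pairs the sequences up and delegates to #1.
    // Stops when either sequence runs out (an assumption, see above).
    public static IEnumerable<Point3> Offset(IEnumerable<Point3> a, IEnumerable<Point3> b)
    {
        using (IEnumerator<Point3> ea = a.GetEnumerator())
        using (IEnumerator<Point3> eb = b.GetEnumerator())
        {
            while (ea.MoveNext() && eb.MoveNext())
                yield return Offset(ea.Current, eb.Current);
        }
    }
}
```

The sequence overload is lazy and single-threaded as written; any parallelism can be layered on by the caller without touching #1.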

Avish
+1  A: 

My answer is both. I like having the simplest functions possible, so #1 is good. At the same time, convenience methods to operate on lists are darn useful, and can do the hard work of spawning threads if it's appropriate.

One of my beefs with Java (well, almost all languages, but Java is new enough they should have known better) is that they still haven't done a good job of making the base library take advantage of multiple threads, or provided many mechanisms to help developers with that. There really should be a generic function to do "apply this function to all elements in this list", and have that function figure out how many cores are available, how big the list is, what the overhead is, and optimize accordingly.

Chris Arguin
+1  A: 

Option 1 is the logical core operation. With .NET 4.0 you can achieve the same operation as option 2 using the Zip operator. From memory, instead of:

var newPoints = Offset(firstPoints, secondPoints);

you'd write:

var newPoints = firstPoints.Zip(secondPoints, (p1, p2) => Offset(p1, p2));

You may want to consider making Offset an extension method on Point3 if you're using .NET 3.5 as well. (Alternatively, if you control the Point3 type, this sounds like a logical addition - it would be nice to write (p1, p2) => p1 + p2 in the call to Zip.)
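A rough sketch of both suggestions, assuming a hypothetical Point3 with public X, Y and Z fields (the real type isn't shown in the question). The class name Point3Extensions and the parameter name delta are made up for illustration.

```csharp
// Hypothetical Point3; the real type is not shown in the question.
public struct Point3
{
    public readonly double X, Y, Z;

    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }

    // If you control the type: define addition directly, so p1 + p2 works.
    public static Point3 operator +(Point3 a, Point3 b)
    {
        return new Point3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }
}

public static class Point3Extensions
{
    // Extension method (C# 3.0 / .NET 3.5 onwards), callable as p.Offset(delta).
    public static Point3 Offset(this Point3 point, Point3 delta)
    {
        return point + delta;
    }
}
```

If you don't control Point3, the extension method would have to do the component-wise addition itself instead of relying on the + operator.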

If you're not using .NET 4.0 but Zip appeals to you, we have an implementation in MoreLINQ - it's pretty simple.

So far, nothing has been related to multi-threading... now I don't know offhand whether there's a PLINQ implementation of Zip in .NET 4.0, but it would make sense for there to be one, IMO.
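For what it's worth, .NET 4.0 does ship a ParallelEnumerable.Zip overload that takes two ParallelQuery sources. The sketch below uses plain ints as a stand-in for Point3, and the class and method names are made up for illustration. Calling AsOrdered() on both sides is there so that elements are paired by position and the output keeps the input order.

```csharp
using System.Linq;

public static class ParallelZipSketch
{
    // Element-wise sum of two arrays, standing in for Offset(p1, p2) on points.
    public static int[] AddPairwise(int[] first, int[] second)
    {
        return first.AsParallel().AsOrdered()
                    .Zip(second.AsParallel().AsOrdered(), (x, y) => x + y)
                    .ToArray();
    }
}
```

Whether this beats the sequential Zip for something as cheap as adding three doubles is worth measuring rather than assuming.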

Jon Skeet
Thanks Jon. I agree. I also thought you could do something like this with #1: var points = Parallel.ForEach(origPoints, Offset) or something. I haven't used TPL so I don't know the full syntax. If you were to parallelize the offset operation for a collection of points using a single offset Point3, how would you do it?
Joan Venge
Also you are right, offset is logical addition. I am making high level names for the users of the application where they could use it in the app. With high level it serves more than the + operator, in terms of integration with the app.
Joan Venge
Using a *single* offset is easy - it's just a case of using Select. You can parallelise that easily with PLINQ. (You're just projecting one sequence of points to another in a fixed way.) It's the "offset a sequence by another sequence" which is tricky.
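A minimal sketch of the single-offset case with PLINQ, assuming a hypothetical Point3 with public X, Y and Z fields. The names SingleOffsetSketch, OffsetAll and delta are made up for illustration.

```csharp
using System.Linq;

// Hypothetical Point3; the real type is not shown in the question.
public struct Point3
{
    public readonly double X, Y, Z;

    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class SingleOffsetSketch
{
    // Projects every point to a new point shifted by the same delta.
    public static Point3[] OffsetAll(Point3[] points, Point3 delta)
    {
        return points
            .AsParallel()
            .AsOrdered() // keep results in the same order as the input
            .Select(p => new Point3(p.X + delta.X, p.Y + delta.Y, p.Z + delta.Z))
            .ToArray();
    }
}
```

Dropping AsParallel() and AsOrdered() gives the plain sequential Select version.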
Jon Skeet
Thanks Jon. It makes sense.
Joan Venge