views:

203

answers:

5

For or While loops in Mathematica code always make me feel a little dirty but I was confusing myself trying to do some list munging all functional-like, and resorted to this:

(* # Given a list of {x,y} pairs, transform the data as follows: every time 
   # there's a decrease in y-value from one datapoint to the next, say {x1,Y} 
   # followed by {x2,y}, add Y to the value of every datapoint on or after x2. *)
monotonify[data_] := Module[{data0, i, offset = 0},
  data0 = data;
  For[i = 2, i <= Length[data], i++,
    If[data[[i-1,2]] > data[[i,2]], offset += data[[i-1,2]]];
    data0[[i]] += {0,offset}];
  data0]

(Think of the y-values as odometer readings where sometimes the odometer gets accidentally reset -- evident because the value decreases, which odometers shouldn't do. So we transform the readings by adding the last known value before each reset to all future values.)

How would you write monotonify in a nice functional style?

(The fact that I don't consider the above For loop perfectly fine is probably a mild form of OCD.)

+2  A: 

Once the gauntlet was down I couldn't not try it, but I kind of think the For loop version is more straightforward:

mon00[{prev_,offset_}, next_] := {next, offset + If[prev > next, prev, 0]}
monotonify0[list_] := list + Rest[FoldList[mon00, {-Infinity,0}, list]][[All,2]]
monotonify[data_] := Transpose@{#1, monotonify0[#2]}& @@ Transpose@data

The idea is to write a helper function that does it for just a plain list of the y-values and then use the double-Transpose idiom to operate on just the second column of the data.

Handy reference for the double-Transpose idiom

For transforming a particular column in a matrix, eg, replacing each value x in column 2 of a 4-column matrix with transformElement[x]:

{#1, transformElement[#2], #3, #4}& @@@ matrix

If you need to transform a column with a function that takes the whole column as a list, use the following idiom:

Transpose @ {#1, transformList[#2], #3, #4}& @@ Tranpose@matrix
dreeves
Ooh, turns out this version is 3 times faster than the For loop version.
dreeves
+4  A: 

OK, now I've fixed my approach to work with inputs as originally requested.

Start with a sample dataset:

dataset = {{a, 1}, {b, 2}, {c, 3}, {d, 4}, {e, 5}, {f, 0}, {g, 4}, 
{h,5}, {i, 6}, {j, 7}, {k, 4}, {l, 7}, {m, 8}, {n, 9}, {o, 0}, {p,2}, 
{q, 3}};

Take the transpose:

trDataset = Transpose[dataset];

next a function to operate on the Y-values only:

trDataset[[2]] = FoldList[Plus, dataset[[1, 2]], Map[Max[#, 0] &, Differences[dataset[[All, 2]]]]]

Undo the transposition:

dataset = Transpose[trDataset]

and the output is now

{{a, 1}, {b, 2}, {c, 3}, {d, 4}, {e, 5}, {f, 5}, {g, 9}, {h, 10}, {i, 
  11}, {j, 12}, {k, 12}, {l, 15}, {m, 16}, {n, 17}, {o, 17}, {p, 
  19}, {q, 20}}

I still haven't tested the performance of this solution.

EDIT: OK, here's the basis of a fix, I'll leave the rest of the work to you @dreeves. This version of monotonify only works on a list of numbers, I haven't integrated it into my previous suggestion to work with your inputs.

monotonify[series_] := 
 Split[series, Less] //. {a___, x_List, y_List, z___} /; 
     Last[x] > First[y] -> {a, x, y + Last[x], z} // Flatten

EDIT 2: Another function which works on a list of numbers. This is much faster than my previous attempt.

monotonify[series_] := 
Accumulate[Flatten[Map[Flatten[{#[[1]], Differences[#]}] &, 
       Split[series, Less]]]]
High Performance Mark
Ooh, nice! Thanks Mark! This looks better than mine. I bet it will be faster. You could follow the example in my solution to wrap this up as one function, with your FoldList thing as a helper function.
dreeves
Oops, just tried it and I believe it's not quite to spec: considering only y-values, {1,2,3,1,2,3} should get transformed to {1,2,3,4,5,6}. Ie, the decrease from 3 to 1 means that 3 should get added to all subsequent values, starting with the 1.
dreeves
This is strange. It works but the first time I run it it outputs this warning: "Last::normal: Nonatomic expression expected at position 1 in Last[x]."
dreeves
Also, it's super slow on huge lists for some reason.
dreeves
I would guess it is super slow because of "//. {a___, x_List, y_List, z___}"
Davorak
+2  A: 

I did it using mostly Split, Flatten and Accumulate. I'm not sure the end result is easier to understand than the For loop, but it should be nice and fast if it matters.

monotonize[list_] := 
  With[{splits = Split[list, LessEqual]},
   With[{diffs = Most[Last /@ splits] - Rest[First /@ splits]},
    Flatten[
      MapThread[Plus, {Accumulate[Prepend[diffs, 0]], splits}],
     1]]];

monotonizeSecond[list_] :=
  With[{firsts = First /@ list, lasts = Last /@ list},
   Transpose[{firsts, monotonize@lasts}]];

I think the copious use for With makes it a little clearer than a solution that relied more on anonymous functions woulld. Also, monotonize seems like the kind of thing that could be useful on "undecorated" lists, so I broke it out as a separate function.

Pillsy
Thanks Pillsy. I believe this has the same small deviation from the spec as Mark's first version (your versions probably make more sense for a general monotonize function). For example, {1,2,1} should transform to {1,2,3} not {1,2,2}. See also my comment on Mark's answer.
dreeves
This does seem to be fastest. Using the double transpose trick instead of what you did in monotonizeSecond speeds it up slightly more still.
dreeves
+1 Any reason why you use First /@ list rather then list[[All,1]]?
Davorak
@Davorak: I find the `[[]]` syntax extremely hard to read when an indexed array is used as a function argument, and editing and refactoring often causes that to happen, so I avoid using the syntax reflexively.
Pillsy
+1  A: 

Fundamentally what makes this challenging is that most functional operators in Mathematica operate on one element of a list at a time. This is not the only option however these functions could have been set up to take two adjacent elements of a list at a time, this hypothetical function would make it trivial to get the desired result.

Instead of transforming the function we can easily transform the data using Partition.

Clear[monotonify];
monotonify[data_] := 
 Transpose[{data[[All, 1]], 
  Rest@FoldList[
    If[#2[[1]] < #2[[2]], #1 + #2[[2]] - #2[[1]], #1 + #2[[2]]] &, 0, 
    Partition[data[[All, 2]], 2, 1, {2, -1}, 0]]}]

This version I refactored to add a helper function to make it clear how the function folded over works, but mathematica does not optimize it as well.

Clear[monotonify, m00];
m00[acc_, {prev_, next_}] := 
 If[prev < next, acc + next - prev, acc + next]
monotonify[data_] := 
 Transpose[{data[[All, 1]], 
  Rest@FoldList[m00, 0, Partition[data[[All, 2]], 2, 1, {2, -1}, 0]]}]

edit: forgot some {}

Davorak
With the limited data sets I tested over it seems to be faster then dreeves first answer.
Davorak
First, I confirmed that this is correct; thanks Davorak! Using Partition is a good idea. As for speed, I used this test data: Transpose@{Range[10^6], RandomReal[{0, 1}, {10^6}]}. Interestingly, on that data, your second version is faster and is the same speed as my version and your first version is slower.
dreeves
Try lst = Transpose@{Range[10^6], RandomInteger[{1, 100}, {10^6}]}. But for some reason the same function takes longer with a integer range ~10000. It seems like some internal optimization trade off is going on.
Davorak
+3  A: 

Here is another solution:

Module[{corr, lasts},
 lasts = data[[All, 2]]; 
 corr = Prepend[Accumulate[MapThread[If[#1 > #2, #1, 0] &, {Most[lasts], Rest[lasts]}]], 0];
 Transpose[{data[[All, 1]], lasts + corr}]]

It computes a correction vector that is then added to the y-values of the given data points.

sakra