views:

54

answers:

1

I have a repeating signal that varies a little bit with each cycle of a process that repeats roughly every second, though the duration and the contents of each cycle vary from each other a little bit within some parameters. There are a thousand x,y coordinates for every second of my signal data. A small, but important, segment of the data within each cycle is corrupted, and I want to replace each corrupted segment with an upward facing parabola.

For each data segment that needs to be replaced by the parabola, I have the x,y coordinates of three points. The vertex/minimum is one of those points. And the other two points are the left and right tops of the upward-facing U-shape that is the parabola. In other words, the left top is the x,y coordinate pair of the lowest x value in the domain of this function, while the right top is the x,y coordinate pair of the highest x value in the domain of this function. The y-coordinates of the left top and right top are equal to each other, and are the two highest y-values in the data segment.

How can I write the code to plot the remaining data points in this upward facing parabola? Remember that this function needs to be called 60 or 70 times for every minute of data, and that the shape/formula of the parabola will need to change every time this function is called, in order to account for different relationships between these three pairs of x,y coordinates in each resulting parabola.

def ReplaceCorruptedDataWithParabola(Xarray, Yarray, LeftTopX, LeftTopY
                                     , LeftTopIndex, MinX, MinY, MinIndex
                                     , RightTopX, RightTopY, RightTopIndex):  

    # Step One: Derive the formula for the upward-facing parabola using 
    # the following data from the three points:
        LeftTopX,LeftTopY,LeftTopIndex  
        MinX,MinY,MinIndex  
        RightTopX,RightTopY,RightTopIndex 

    # Step Two: Use the formula derived in step one to plot the parabola in
    # the places where the corrupted data used to reside:
    for n in Xarray[LeftTopX:RightTopX]:
        Yarray[n]=[_**The formula goes here**_]

    return Yarray 

Note: Xarray and Yarray are each single-column vectors with data at each index that links the two arrays as sets of x,y coordinates. They are both numpy arrays. Xarray contains time information and does not change, but Yarray contains signal data, including the corrupted segment that will be replaced with the parabolic data that needs to be calculated by this function.

+3  A: 

So, as I understand it, you have 3 points that you want to fit a parabola to.

Normally, it's simplest to just use numpy.polyfit, but if you're really worried about speed, and you're fitting exactly three points, there's no point in using a least-squares fit.

Instead, we have an even-determined system (fitting a parabola to 3 x,y points), and we can get an exact solution with simple linear algebra.

So, all in all, you might do something like this (most of this is plotting the data):

import numpy as np                                                                              
import matplotlib.pyplot as plt                                                                 

def main():
    # Generate some random data
    x = np.linspace(0, 10, 100)
    y = np.cumsum(np.random.random(100) - 0.5)

    # Just selecting these arbitrarly 
    left_idx, right_idx = 20, 50      
    # Using the mininum y-value within the arbitrary range
    min_idx = np.argmin(y[left_idx:right_idx]) + left_idx 

    # Replace the data within the range with a fitted parabola
    new_y = replace_data(x, y, left_idx, right_idx, min_idx)  

    # Plot the data
    fig = plt.figure()
    indicies = [left_idx, min_idx, right_idx]

    ax1 = fig.add_subplot(2, 1, 1)
    ax1.axvspan(x[left_idx], x[right_idx], facecolor='red', alpha=0.5)
    ax1.plot(x, y)                                                    
    ax1.plot(x[indicies], y[indicies], 'ro')                          

    ax2 = fig.add_subplot(2, 1, 2)
    ax2.axvspan(x[left_idx], x[right_idx], facecolor='red', alpha=0.5)
    ax2.plot(x,new_y)                                                 
    ax2.plot(x[indicies], y[indicies], 'ro')

    plt.show()

def fit_parabola(x, y):
    """Fits the equation "y = ax^2 + bx + c" given exactly 3 points as two
    lists or arrays of x & y coordinates"""
    A = np.zeros((3,3), dtype=np.float)
    A[:,0] = x**2
    A[:,1] = x
    A[:,2] = 1
    a, b, c = np.linalg.solve(A, y)
    return a, b, c

def replace_data(x, y, left_idx, right_idx, min_idx):
    """Replace the section of "y" between the indicies "left_idx" and
    "right_idx" with a parabola fitted to the three x,y points represented
    by "left_idx", "min_idx", and "right_idx"."""
    x_fit = x[[left_idx, min_idx, right_idx]]
    y_fit = y[[left_idx, min_idx, right_idx]]
    a, b, c = fit_parabola(x_fit, y_fit)

    new_x = x[left_idx:right_idx]
    new_y = a * new_x**2 + b * new_x + c

    y = y.copy() # Remove this if you want to modify y in-place
    y[left_idx:right_idx] = new_y
    return y

if __name__ == '__main__':
    main()

Example plot

Hope that helps a bit...

Joe Kington
+1 nice answer.
zellus