Hi. In numerically oriented languages (Matlab, Fortran), the range operator and its semantics are very handy when working with multidimensional data. For example:

A(i:j,k,:n) // represents two-dimensional slice B(i:j,0:n)  of A at index k

Unfortunately, C++ does not have a range operator (:). Of course it can be emulated using a range/slice functor, but the semantics are less clean than in Matlab. I am prototyping a matrix/tensor domain language in C++ and am wondering if there are any options to reproduce the range operator. I would still like to rely on the C++/preprocessor framework exclusively.

So far I have looked through Boost.Wave, which might be a suitable option.

Is there any other means to introduce new non-native operators to a C++ DSL?

I know you cannot add new operators. I am specifically looking for a workaround. One thing I came up with (a very ugly hack that I do not intend to use):

#define A(r) A[range(((1)?r), ((0)?r))] // assume A overloads []
A(i:j); // abuse ternary operator
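
To make the expansion easier to follow, here is a minimal sketch of the same trick. The names `range`, `Array` and `SLICE` are hypothetical, and the macro is given its own name instead of reusing the name of the object, which is an assumption made only for readability.

```cpp
#include <cassert>

// Hypothetical range type; only the two bounds matter here.
struct range {
    int from, to;
    range(int f, int t) : from(f), to(t) {}
};

// Hypothetical container whose operator[] simply echoes the range it receives.
struct Array {
    range operator[](const range& r) const { return r; }
};

// SLICE(A, i:j) expands to (A)[range(((1) ? i:j), ((0) ? i:j))].
// The first conditional always selects i, the second always selects j.
#define SLICE(obj, r) (obj)[range(((1) ? r), ((0) ? r))]
```

With `Array A; int i = 2, j = 5;`, the expression `SLICE(A, i:j)` yields a range from 2 to 5. The trick breaks as soon as a bound contains a top-level comma or an operator with lower precedence than `?:`.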
+2  A: 

No -- you can't define your own operators in C++. Bjarne Stroustrup details why.

Billy ONeal
+2  A: 

A solution that I've used before is to write an external preprocessor that parses the source and replaces any uses of your custom operator with vanilla C++. For your purposes, a : b uses would be replaced with something like a.operator_range_(b), and operator:() declarations with declarations of range_ operator_range_(). In your makefile you then add a rule that preprocesses source files before compiling them. This can be done with relative ease in Perl.

However, having worked with a similar solution in the past, I do not recommend it. It has the potential to create maintainability and portability issues if you do not remain vigilant of how source is processed and generated.
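
For illustration only, the rewriting step could look roughly like the sketch below. It is written in C++ with `std::regex` rather than Perl, it uses a `..` token instead of `:`, and it emits a call to a hypothetical free function `range(a, b)`; all three are assumptions, not the tool I actually used. It only handles bounds that are plain identifiers or integer literals, and it does not skip string literals or comments.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrite every `lhs .. rhs` into `range(lhs, rhs)`.
// Only single identifiers or integer literals are recognized as bounds.
std::string rewrite_ranges(const std::string& source) {
    static const std::regex range_op(R"((\w+)\s*\.\.\s*(\w+))");
    return std::regex_replace(source, range_op, "range($1, $2)");
}
```

A real preprocessor would need an actual tokenizer to cope with nested expressions, which is where most of the maintenance cost comes from.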

Jon Purdy
That is an idea. In principle I can replace `:` with some other binary global operator which generates a range, and use `,` to append ranges. I am aware of the potential problems; I just want to try it out as some sort of rapid application development tool.
aaa
Good luck. Beware of crashing into `?:` and labels for access specifiers, switch cases, and (if you use them) gotos. I would actually recommend using something like `..` because it'd be easier to parse.
Jon Purdy
+2  A: 

As Billy said, you cannot define new operators. However, you can come very close to what you want with "regular" operator overloading (and maybe some template metaprogramming). It would be quite easy to allow for something like this:

#include <iostream>

class FakeNumber {
    int n;
public:
    FakeNumber(int nn) : n(nn) {}
    operator int() const { return n; }
};

class Range {
    int f, t;
public:
    Range(const int& ff, const int& tt) : f(ff), t(tt) {}
    int from() const { return f; }
    int to() const { return t; }
};

Range operator-(const FakeNumber& a, const int b) {
    return Range(a,b);
}

class Matrix {
public:
    void operator()(const Range& a, const Range& b) {
        std::cout << "(" << a.from() << ":" << a.to() << "," << b.from() << ":" << b.to() << ")" << std::endl;
    }
};

int main() {
    FakeNumber a=1,b=2,c=3,d=4;
    Matrix m;
    m(a-b,c-d);

    return 0;
}

The downside is that this solution doesn't support all-literal expressions. Either from or to has to be a user-defined class, since we can't overload operator- for two primitive types.

You can also overload operator* to allow specifying stepping, like so:

m(a-b*3,c-d); // equivalent to m[a:b:3,c:d]

And overload both versions of operator-- to allow ignoring one of the bounds:

m(a--,--d); // equivalent to m[a:,:d]
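
A minimal sketch of how the stepping and open-bound overloads might fit together is shown here. It is a standalone variant, not a drop-in extension of the code above: `SteppedBound`, `OPEN_LOW` and `OPEN_HIGH` are hypothetical names, and `FakeNumber` deliberately has no conversion to int in this version so that `b*3` cannot silently fall back to integer multiplication.

```cpp
#include <cassert>
#include <climits>

// Hypothetical sentinels marking an omitted lower or upper bound.
const int OPEN_LOW = INT_MIN;
const int OPEN_HIGH = INT_MAX;

struct Range {
    int from, to, step;
};

// Upper bound paired with a step, produced by `b * 3`.
struct SteppedBound {
    int to, step;
};

class FakeNumber {
    int n;
public:
    FakeNumber(int nn) : n(nn) {}
    int value() const { return n; }
    // Postfix a-- means "from a to the end"; the number is not modified.
    Range operator--(int) const { return Range{n, OPEN_HIGH, 1}; }
    // Prefix --d means "from the start to d"; the number is not modified.
    Range operator--() const { return Range{OPEN_LOW, n, 1}; }
};

// a - b gives a unit-step range.
inline Range operator-(const FakeNumber& a, const FakeNumber& b) {
    return Range{a.value(), b.value(), 1};
}

// b * 3 binds tighter than a - ..., so it is evaluated first.
inline SteppedBound operator*(const FakeNumber& b, int step) {
    return SteppedBound{b.value(), step};
}

// a - (b * 3) combines the lower bound with the stepped upper bound.
inline Range operator-(const FakeNumber& a, const SteppedBound& s) {
    return Range{a.value(), s.to, s.step};
}
```

The design relies on normal operator precedence: multiplication is evaluated before subtraction, so the step attaches to the upper bound before the range is formed.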

Another option is to define two objects, named something like Matrix::start and Matrix::end, or whatever you like, and then instead of using operator--, you could use them, and then the other bound wouldn't have to be a variable and could be a literal:

m(start-15,38-end); // This clutters the syntax however

And you could of course use both ways.
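
A rough sketch of the start/end variant follows, assuming hypothetical tag types `Start` and `End`, a `bounds` namespace to hold the two objects, and sentinel values for the open side. Because one operand is always a user-defined type, the other one can be a plain literal.

```cpp
#include <cassert>
#include <climits>

// Hypothetical sentinels marking an omitted lower or upper bound.
const int OPEN_LOW = INT_MIN;
const int OPEN_HIGH = INT_MAX;

struct Range {
    int from, to;
};

// Empty tag types; their only job is to select the right overload.
struct Start {};
struct End {};

namespace bounds {
const Start start = Start();
const End end = End();
}

// start - 15 means "from the beginning up to 15".
inline Range operator-(Start, int to) { return Range{OPEN_LOW, to}; }

// 38 - end means "from 38 to the end".
inline Range operator-(int from, End) { return Range{from, OPEN_HIGH}; }
```

Keeping the two objects in a namespace (or as static members of the matrix class) avoids clashing with other entities named `end`.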

I think it's pretty much the best you can get without resorting to bizarre solutions, such as custom prebuild tools or macro abuse (of the sort Matthieu presented and advised against using :)).

conio
+1  A: 

The easiest solution is to use a method on matrix instead of an operator.

A.range(i, j, k, n);

Note that you typically do not use , in a subscript operator [], e.g. you write A[i][j] instead of A[i,j]. The second form is possible by overloading the comma operator, but that forces i and j to be objects rather than plain numbers.

You could define a range class that could be used as a subscript for your matrix class.

class RealMatrix
{
public:
    MatrixRowRangeProxy operator[] (int i) {
        return operator[](range(i, 1));
    }

    MatrixRowRangeProxy operator[] (range r);

    // ...

    RealMatrix(const MatrixRangeProxy & proxy);
};

// A generic view on a matrix
class MatrixProxy
{
protected:
    RealMatrix * matrix;
};


// A view on a matrix of a range of rows
class MatrixRowRangeProxy : public MatrixProxy
{
public:
    MatrixColRangeProxy operator[] (int i) {
        return operator[](range(i, 1));
    }

    MatrixColRangeProxy operator[] (const range & r);

    // ...
};

// A view on a matrix of a range of columns
class MatrixColRangeProxy : public MatrixProxy
{
public:
    MatrixRangeProxy operator[] (int i) {
        return operator[](range(i, 1));
    }

    MatrixRangeProxy operator[] (const range & r);

    // ...
};

Then you can copy a range from one matrix into another.

RealMatrix A = ...
RealMatrix B = A[range(i,j)][range(k,n)];

Finally, by creating a Matrix class that can hold either a RealMatrix or a MatrixProxy, you can make a RealMatrix and a MatrixProxy appear the same from the outside.

Note that the operator[] overloads on the proxies are not and cannot be virtual, since each one returns a different proxy type by value.
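
The declarations above are only an outline. A compact version that actually compiles might look like the sketch below. It makes several assumptions that are not in the outline: `range` is a half-open interval [first, last), elements are doubles stored row-major in a `std::vector`, the views are read-only, and the int overloads and the common MatrixProxy base are left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index interval [first, last) (an assumption).
struct range {
    std::size_t first, last;
    range(std::size_t f, std::size_t l) : first(f), last(l) {}
    std::size_t size() const { return last - first; }
};

class RealMatrix;

// Read-only view on a block of rows and columns.
class MatrixRangeProxy {
    const RealMatrix* m;
    range rows, cols;
public:
    MatrixRangeProxy(const RealMatrix& mat, range r, range c)
        : m(&mat), rows(r), cols(c) {}
    std::size_t nrows() const { return rows.size(); }
    std::size_t ncols() const { return cols.size(); }
    double at(std::size_t i, std::size_t j) const;  // defined after RealMatrix
};

// View on a range of rows; a second subscript selects the columns.
class MatrixRowRangeProxy {
    const RealMatrix* m;
    range rows;
public:
    MatrixRowRangeProxy(const RealMatrix& mat, range r) : m(&mat), rows(r) {}
    MatrixRangeProxy operator[](range c) const {
        return MatrixRangeProxy(*m, rows, c);
    }
};

class RealMatrix {
    std::size_t r, c;
    std::vector<double> data;
public:
    RealMatrix(std::size_t rows, std::size_t cols)
        : r(rows), c(cols), data(rows * cols, 0.0) {}
    // Non-explicit: copies the elements seen through a view.
    RealMatrix(const MatrixRangeProxy& p)
        : r(p.nrows()), c(p.ncols()), data(p.nrows() * p.ncols()) {
        for (std::size_t i = 0; i < r; ++i)
            for (std::size_t j = 0; j < c; ++j)
                data[i * c + j] = p.at(i, j);
    }
    std::size_t nrows() const { return r; }
    std::size_t ncols() const { return c; }
    double& operator()(std::size_t i, std::size_t j) { return data[i * c + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * c + j]; }
    MatrixRowRangeProxy operator[](range rows) const {
        return MatrixRowRangeProxy(*this, rows);
    }
};

// Translate view-relative indices into indices of the underlying matrix.
inline double MatrixRangeProxy::at(std::size_t i, std::size_t j) const {
    return (*m)(rows.first + i, cols.first + j);
}
```

With this in place, `RealMatrix B = A[range(1,3)][range(0,2)];` copies rows 1 to 2 and columns 0 to 1 of A into a new 2x2 matrix. The proxies only store a pointer and two ranges, so nothing is copied until the final conversion.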

iain
A: 

If you want to have fun, you may check out IdOp.

If you are really working on a project, I don't suggest using this trick though. Maintenance will suffer from clever tricks.

Your best bet is thus to bite the bullet and use explicit notation. A short function called range which yields a custom defined object for which the operators are overloaded seems especially suitable.

Matrix<10,30,50> matrix = /**/;
MatrixView<5,6,7> view = matrix[range(0,5)][range(0,6)][range(0,7)];
Matrix<5,6,7> n = view;

Note that the operator[] only has 4 overloads (const/non-const + basic int / range) and yields a proxy object (until the last dimension). Once applied to the last dimension, it gives a view of the matrix. A normal matrix may be built from a view that has the same dimensions (non-explicit constructor).

Matthieu M.
Rather than play around with `operator[]()`, I'd define a `Matrix` member function that takes a subview. Overloading operator functions can cause unexpected problems.
David Thornley
Hum, yes even better :p It would save on the typing and gain in readability too!
Matthieu M.
+2  A: 

An alternative is to build a C++ variant dialect using a program transformation tool.

The DMS Software Reengineering Toolkit is a program transformation engine, with an industrial strength C++ Front End. DMS, using this front end, can parse full C++ (it even has a preprocessor and can retain most preprocessor directives unexpanded), automatically build ASTs and complete symbol tables.

The C++ front end comes in source, using a grammar derived directly from the standard. It is technically straightforward to add new grammar rules including those that would allow ":" syntax as array subscripts as you have described, and as Fortran90+ has implemented. One can then use the program transformation capability of DMS to transform the "new" syntax into "vanilla" C++ for use in conventional C++ compilers. (This scheme is a generalization of the Intentional Programming model of "add DSL concepts to your language").

We in fact did a concept demonstration of "Vector C++" using this approach.

We added a multidimensional Vector datatype, whose storage semantics are only that array elements are distinct. This is different than C++'s model of sequential locations, but you need this different semantic if you want the compiler/transformer to have freedom to lay out memory arbitrarily, and this is fundamental if you want to use SIMD machine instructions and/or efficient cache accesses along different axes.

We added Fortran-90 style scalar and subarray range accesses, added virtually all of F90's array-processing operations, added a good fraction of APL's matrix operations, all by adjusting the DMS C++ grammar.

Finally, we built two translators using DMS transformational capability: one mapping a significant part of this (remember, this was a concept demo) to vanilla C++ so you could compile and run Vector C++ applications on a typical workstation, and the other mapping C++ to a PowerPC C++ dialect with SIMD instruction extensions, and we generated SIMD code that was pretty reasonable we thought. Took us about 6 man-months to do all this.

The customer for this ultimately bailed out (his business model didn't include supporting a custom compiler in spite of his severe need for parallel/SIMD based operations), and it has been languishing on the shelf. We've chosen not to pursue this in the broader market because it isn't clear what the market really is. I'm pretty sure there are organizations for which this would be valuable.

Point is, you really can do this. It is almost impossible using ad hoc methods. It is technically quite straightforward with a strong enough program transformation system. It isn't a walk in the park.

Ira Baxter
Thanks. I cannot use non-free software (academic project without funding), but I found some software (rose) which seems to have some of those facilities.
aaa
Rose does have some of DMS's capability. But it uses the EDG C++ front end, which AFAIK is a hand-written C++ parser. Grafting your desired changes into the EDG front end will likely be considerably harder than modifying a grammar (which DMS uses) and may break how the rest of the EDG front end/Rose collects data. It won't be a walk in the park with Rose, either, but that's at least a choice where you have a chance of succeeding. Good luck.
Ira Baxter