GameMonkey Script

Posted: Sat Sep 27, 2008 12:39 am
Vector math is very common in games, and due to the way reference types are made and garbage collected in GM it can be extremely slow to do even moderately complex operations on them.

This is a tutorial explaining how to make a native Vector type on the stack that isn't subject to garbage collection or dereferencing (like other user types). Most of the knowledge came from this thread: http://www.gmscript.com/gamemonkey/forum/viewtopic.php?f=6&t=272. I just added the instructions to create the broadcast float type and how to change the = operator. I tried my best to explain how each step in the process works so you can use this knowledge to add your own native types later.

This increases sizeof(gmVariable) to 20 bytes (3 * 4 for the floats, 4 for the flags, 4 for the type), 4 bytes over the desired 16. This will likely cause a performance decrease on most CPUs, since data is usually passed around in 128-bit chunks. I have an idea for how to solve this problem. It involves creating a new union in gmVariable that contains the current union, along with a short flags variable at a 12-byte offset and a short type variable at a 14-byte offset. I need to get the thumbs-up from Greg before even attempting it, as it's a lot of work, and I don't know of any unintended consequences of changing gmType to short (although I'm currently running it as short, waiting for a bug!).

I used a 3D vector; you could change it to 2D without any other modifications, and it would fit in the 128-bit boundary. If you want 4D, there's nothing you can do to fit inside 128 bits. Deciding to jump to 256 bits now might allow for all sorts of crazy additions if you want, but I think I'm happy if I can get it to 128 bits.
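
The size arithmetic above can be checked with a small standalone sketch. The struct names here are made up for illustration and are not GameMonkey's real gmVariable; the sketch also assumes a 4-byte float and a 32-bit gmptr, and it flattens the proposed "union within a union" into a plain struct with two 16-bit fields after the 12-byte payload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical payload of N floats (N = 2, 3 or 4 dimensions).
template <int N>
struct Payload { float v[N]; };

// Layout described in the post: payload, 32-bit flags, then a 32-bit type.
template <int N>
struct WideVariable {
    union {
        std::int32_t  m_int;
        float         m_float;
        std::uint32_t m_ref;  // assumption: gmptr is 32 bits wide
        struct { Payload<N> m_vec; std::uint32_t m_flags; } m_bcast_flags;
    } m_value;
    std::int32_t m_type;
};

// Proposed packing for 3D: 16-bit flags at offset 12, 16-bit type at offset 14.
struct PackedVariable3 {
    union {
        std::int32_t  m_int;
        float         m_float;
        std::uint32_t m_ref;
        Payload<3>    m_vec;
    } m_value;
    std::uint16_t m_flags;
    std::int16_t  m_type;
};

static_assert(sizeof(WideVariable<2>) == 16, "2D fits in 128 bits");
static_assert(sizeof(WideVariable<3>) == 20, "3D is 4 bytes over");
static_assert(sizeof(WideVariable<4>) == 24, "4D cannot fit in 128 bits");
static_assert(sizeof(PackedVariable3) == 16, "packed 3D fits in 128 bits");
static_assert(offsetof(PackedVariable3, m_flags) == 12, "flags at byte 12");
static_assert(offsetof(PackedVariable3, m_type) == 14, "type at byte 14");
```

On a 64-bit build where gmptr is 8 bytes, the numbers would differ because of pointer alignment, so treat these figures as valid only under the stated assumptions.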

There were a few things I kept in mind when doing this. I wanted to keep the gmVariable limited to 128 bits. I failed at this... for now. I wanted the new syntax to be easily understood and mimic GLSL or other graphics/vector units as much as possible. I wanted to minimize changes to GM and use the user type API as much as possible. Even with the minimization, it's still a fair bit of work.

I did some tests, and I have game entities that each frame did this:
local vel = math.Vector();
vel = entity.player.location - entity.location;
vel = vel.normalize();
triangle.set("velocity", vel * entity.speed); //sets the C++ entity's velocity

Under the traditional way of using Vectors I would create over 200kb of extra memory in the VM every second with 100 entities, this way I create zero. I can't measure exactly how much faster it is since it depends on how much memory you've given to GM, and how many temporary GC'd blocks you'd create per frame. For me, it's WAY faster - over an order of magnitude faster for some more involved tracking behavior.

How to do it
gmVariable must be changed to allow for the native vector type. C unions are limited to simple types only, so your Vector class needs to be separate from the data. The easiest way to do this is creating a data structure like so: (I included my vector class at the bottom)
Code:
struct VectorData {
   Real x, y, z; //Real must be a 32 bit float
};

and having the Vector class look like:
Code:
class Vector {
   //all operations
   union {
      struct { Real x, y, z; };
      VectorData v;
   };
};
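
As a side note, the anonymous struct inside the union is a compiler extension rather than standard C++ (most compilers accept it). A minimal sketch that avoids it is shown below: it keeps only the named VectorData member and checks at compile time that the data struct stays trivial enough to live in a union. The constructors, length() and normalize() are illustrative guesses, not the author's actual class, and Real is assumed to be float.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

typedef float Real;  // assumption: Real is a 32-bit float, as required above

struct VectorData {
    Real x, y, z;
};

// The data struct must stay trivial so it can sit inside gmVariable's union.
static_assert(std::is_trivial<VectorData>::value, "VectorData must stay trivial");

// Hypothetical wrapper: all behaviour lives here, all storage lives in v.
class Vector {
public:
    VectorData v;

    Vector() { v.x = v.y = v.z = 0; }
    Vector(Real x, Real y, Real z) { v.x = x; v.y = y; v.z = z; }
    Vector(const VectorData &d) : v(d) {}

    Real length() const {
        return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    }

    // Returns a unit-length copy; a zero vector is returned unchanged.
    Vector normalize() const {
        const Real len = length();
        if (len == 0) return *this;
        return Vector(v.x / len, v.y / len, v.z / len);
    }
};
```

With this shape, code that needs the raw data passes `vec.v`, which is what the gmVariable and gmThread helpers later in the post do.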


gmVariable.h
Set the union in gmVariable in gmVariable.h to be:
Code:
union {
   int m_int;
   float m_float;
   VectorData m_vec;
   struct { VectorData m_vec;  uint32 m_flags; } m_bcast_flags; //VectorData member types *MUST* be the same as float m_float
   gmptr m_ref;
} m_value;


Now change the enum in gmVariable.h to be:
Code:
enum {
   GM_NULL = 0, // GM_NULL must be 0 as i have relied on this in expression testing.
   GM_INT,
   GM_FLOAT,
   GM_BROADCAST_FLOAT,
   GM_VECTOR,

   GM_STRING,
   GM_TABLE,
   GM_FUNCTION,
   GM_USER,     // User types continue from here.

   GM_FORCEINT = GM_MAX_INT,
};


The ordering here is *VERY* important. If you look at the BC_GETIND instruction in gmThread.cpp, you'll find a line that looks like:
Code:
if(operand->m_type > t1) t1 = operand->m_type;

this line of code chooses which operator to invoke based on priority. The larger the index is, the higher priority it has.

The problem with inserting things after GM_FLOAT is that in order to tell if a variable is a reference or a stack type, GameMonkey uses GM_FLOAT as its upper bound for stack types. Rather than changing the code to check for GM_VECTOR I added a new line just under the enum:
Code:
const int GM_MAX_STACK_TYPE = GM_VECTOR; //the last stack type in the enum above

then change gmVariable::IsReference() in gmVariable.h to:
Code:
inline bool IsReference() const { return m_type > GM_MAX_STACK_TYPE; }
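
To make the two consequences of the enum ordering concrete, here is a small standalone sketch. It assumes the upper bound is set to the last stack type (GM_VECTOR in the ordering above); DispatchType and IsReferenceType are hypothetical helper names that only mirror the comparison in BC_GETIND and the IsReference() check, they are not GameMonkey functions.

```cpp
#include <cassert>

enum {
    GM_NULL = 0,
    GM_INT,
    GM_FLOAT,
    GM_BROADCAST_FLOAT,
    GM_VECTOR,
    GM_STRING,
    GM_TABLE,
    GM_FUNCTION,
    GM_USER
};

// Assumption: everything up to and including GM_VECTOR lives on the stack.
const int GM_MAX_STACK_TYPE = GM_VECTOR;

// The operand with the larger type id decides which operator table is used.
inline int DispatchType(int a_lhs, int a_rhs) {
    return a_lhs > a_rhs ? a_lhs : a_rhs;
}

// Types above the bound are garbage-collected references.
inline bool IsReferenceType(int a_type) {
    return a_type > GM_MAX_STACK_TYPE;
}
```

So `3.0 * vec` and `vec * 3.0` both end up in the vector's operator, and a broadcast float wins over a plain int or float, while neither new type is ever treated as a GC reference.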


This adds the two new required types: VectorData and Broadcast Float. A Broadcast Float is just an internal type that's used to assign to Vectors. GM internally passes all data by value, so it's impossible to modify stack variables, i.e.:

v = math.Vector();
v.x = 4; //this is not possible

The Broadcast Float is like a float with dimension data tacked on (that's what the m_bcast_flags variable is for). So you can do something like:
v = math.Vector();
v = (4).x; //sets v.x equal to 4
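
Outside the VM, the idea can be modelled in a few lines. This is only a sketch: BroadcastFloat, Broadcast and AssignBroadcast are made-up names, and the bit values (1 = x, 2 = y, 4 = z) follow the flags used later in this post.

```cpp
#include <cassert>
#include <cstdint>

struct VectorData { float x, y, z; };

enum { AXIS_X = 1, AXIS_Y = 2, AXIS_Z = 4 };

// Hypothetical model: the same float copied to every axis, plus an axis mask.
struct BroadcastFloat {
    VectorData    value;
    std::uint32_t flags;
};

inline BroadcastFloat Broadcast(float f, std::uint32_t flags) {
    BroadcastFloat b = { { f, f, f }, flags };
    return b;
}

// Assigning a broadcast float to a vector only touches the flagged axes.
inline void AssignBroadcast(VectorData &dst, const BroadcastFloat &src) {
    if (src.flags & AXIS_X) dst.x = src.value.x;
    if (src.flags & AXIS_Y) dst.y = src.value.y;
    if (src.flags & AXIS_Z) dst.z = src.value.z;
}
```

In this model `v = (4).x` corresponds to `AssignBroadcast(v, Broadcast(4, AXIS_X))`, which leaves y and z as they were.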

Now add new constructors to the gmVariable class:
Code:
explicit inline gmVariable(const VectorData &v) {
   SetVector(v);
}
explicit inline gmVariable(const Vector &v) {
   SetVector(v.v);
}


And the matching SetVector functions in gmVariable:
Code:
inline void SetVector(const VectorData &v) {
   m_type = GM_VECTOR;
   m_value.m_vec = v;
}
inline void SetVector(const Vector &v) {
   m_type = GM_VECTOR;
   m_value.m_vec = v.v;
}


gmThread.h
Add member functions to gmThread in gmThread.h:
Code:
inline const VectorData &ThisVector() const {
   return (GetThis()->m_type == GM_VECTOR ? GetThis()->m_value.m_vec : zero_vector.v);
}
inline void PushVector(const Vector &v) {
   PushVector(v.v);
}
inline void PushVector(const VectorData &v) {
   m_stack[m_top].m_type = GM_VECTOR;
   m_stack[m_top++].m_value.m_vec = v;
}
inline bool ParamVector(int a_param, Vector &a_value) const {
   if (a_param >= m_numParameters) {
      return false;
   }
   gmVariable *var = m_stack + m_base + a_param;
   if (var->m_type == GM_VECTOR) {
      a_value = var->m_value.m_vec;
      return true;
   }
   return false;
}
inline Vector ParamVector(int a_param, const Vector &a_default = zero_vector) const {
   if (a_param >= m_numParameters) {
      return a_default;
   }
   gmVariable *var = m_stack + m_base + a_param;
   if (var->m_type != GM_VECTOR) {
      return a_default;
   }
   return var->m_value.m_vec;
}

Add the helper macro to gmThread.h:
Code:
#define GM_CHECK_VECTOR_PARAM(VAR, PARAM) \
   if(GM_THREAD_ARG->ParamType((PARAM)) != GM_VECTOR) { GM_EXCEPTION_MSG("expecting param %d as Vector", (PARAM)); return GM_EXCEPTION; } \
   Vector VAR = GM_THREAD_ARG->Param((PARAM)).m_value.m_vec;


gmCall.h

Add member functions:
Code:
GM_FORCEINLINE void AddParamVector(const Vector &vec) {
   GM_ASSERT(m_machine);
   GM_ASSERT(m_thread);
   m_thread->PushVector(vec);
   ++m_paramCount;
}
bool GetReturnedVector(rsblsb::math::Vector &a_value) {
   if (m_returnFlag && (m_returnVar.m_type == GM_VECTOR)) {
      a_value = m_returnVar.m_value.m_vec;
      return true;
   }
   return false;
}


gmOperations.cpp
Now we'll create the Broadcast float operations and a getdot for ints and floats:

In the function: gmInitBasicType, add getdot to int and float, as well as add a whole new else if for broadcast floats:
Code:
a_operators[O_GETDOT] = gmIntGetDot; //add to the else branch for integers
a_operators[O_GETDOT] = gmFloatGetDot; //add to else branch for floats


add the new else branch for GM_BROADCAST_FLOAT:
Code:
else if (a_type == GM_BROADCAST_FLOAT) {
   a_operators[O_GETDOT] = gmBcastFloatGetDot;
   a_operators[O_BIT_AND] = gmBcastFloatBitAnd;
   a_operators[O_ADD]    = gmBcastFloatOpAdd;
   a_operators[O_SUB]    = gmBcastFloatOpSub;
   a_operators[O_MUL]    = gmBcastFloatOpMul;
   a_operators[O_DIV]    = gmBcastFloatOpDiv;
   a_operators[O_LT]     = gmBcastFloatOpLT;
   a_operators[O_GT]     = gmBcastFloatOpGT;
   a_operators[O_LTE]    = gmBcastFloatOpLTE;
   a_operators[O_GTE]    = gmBcastFloatOpGTE;
   a_operators[O_NEG]    = gmBcastFloatOpNEG;
}

Now create the functions to support getdot on ints and floats:
Code:
void GM_CDECL gmIntGetDot(gmThread * a_thread, gmVariable * a_operands) {
   const std::string op ((static_cast<gmStringObject*>(GM_OBJECT(a_operands[1].m_value.m_ref)))->GetString());
   a_operands->m_type = GM_BROADCAST_FLOAT;
   a_operands->m_value.m_vec.x = a_operands->m_value.m_vec.y = a_operands->m_value.m_vec.z = a_operands->m_value.m_int;
   a_operands->m_value.m_bcast_flags.m_flags = 0;
   if (op.find("x") != std::string::npos) { //use op.find so the axes could be listed in any order
      a_operands->m_value.m_bcast_flags.m_flags |= 1;
   }
   if (op.find("y") != std::string::npos) {
      a_operands->m_value.m_bcast_flags.m_flags |= 2;
   }
   if (op.find("z") != std::string::npos) {
      a_operands->m_value.m_bcast_flags.m_flags |= 4;
   }
}

void GM_CDECL gmFloatGetDot(gmThread * a_thread, gmVariable * a_operands) {
   const std::string op ((static_cast<gmStringObject*>(GM_OBJECT(a_operands[1].m_value.m_ref)))->GetString());
   a_operands->m_type = GM_BROADCAST_FLOAT;
   a_operands->m_value.m_vec.y = a_operands->m_value.m_vec.z = a_operands->m_value.m_vec.x;
   a_operands->m_value.m_bcast_flags.m_flags = 0;
   if (op.find("x") != std::string::npos) {
      a_operands->m_value.m_bcast_flags.m_flags |= 1;
   }
   if (op.find("y") != std::string::npos) {
      a_operands->m_value.m_bcast_flags.m_flags |= 2;
   }
   if (op.find("z") != std::string::npos) {
      a_operands->m_value.m_bcast_flags.m_flags |= 4;
   }
}

These "upgrade" the int/float to a Broadcast float and set the axis flags, so they can be assigned to a vector.
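
The flag-building part of both functions is the same, so it can be looked at in isolation. The sketch below pulls it into a hypothetical helper (AxisFlagsFromName is not a GameMonkey function) that behaves like the three op.find checks above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds the axis mask from a member name: bit 1 = x, bit 2 = y, bit 4 = z.
// Like the code above, it only checks whether each letter occurs anywhere.
inline std::uint32_t AxisFlagsFromName(const std::string &name) {
    std::uint32_t flags = 0;
    if (name.find('x') != std::string::npos) flags |= 1;
    if (name.find('y') != std::string::npos) flags |= 2;
    if (name.find('z') != std::string::npos) flags |= 4;
    return flags;
}
```

One thing this makes visible: because the test is "contains the letter", any member name is accepted. `(4).zx` gives the same mask as `(4).xz`, `(4).w` gives an empty mask, and a name like `max` would set the x flag. If that looseness is a concern, an exact match against the allowed names would be stricter.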

Change the INTTOFLOAT macro to support Broadcast float:
Code:
#define INTTOFLOAT(A) (((A)->m_type == GM_FLOAT || (A)->m_type == GM_BROADCAST_FLOAT) ? (A)->m_value.m_float : (float) (A)->m_value.m_int)


Now add the operations on broadcast floats:
The x, y, z all return broadcast float, not simple floats. Use xf, yf, and zf to return simple floats (we'll make the Vector class match this later)
Code:
void GM_CDECL gmBcastFloatGetDot(gmThread * a_thread, gmVariable * a_operands) {
   const std::string op ((static_cast<gmStringObject*>(GM_OBJECT(a_operands[1].m_value.m_ref)))->GetString());
   a_operands->m_value.m_bcast_flags.m_flags = 0;
   if (op == "x") {
      a_operands->m_type = GM_BROADCAST_FLOAT;
      a_operands->m_value.m_bcast_flags.m_flags |= 1;
   } else if (op == "y") {
      a_operands->m_type = GM_BROADCAST_FLOAT;
      a_operands->m_value.m_bcast_flags.m_flags |= 2;
   } else if (op == "z") {
      a_operands->m_type = GM_BROADCAST_FLOAT;
      a_operands->m_value.m_bcast_flags.m_flags |= 4;
   } else  if (op == "xf") {
      a_operands->m_type = GM_FLOAT;
   } else if (op == "yf") {
      a_operands->m_type = GM_FLOAT;
      a_operands->m_value.m_float = a_operands->m_value.m_vec.y;
   } else if (op == "zf") {
      a_operands->m_type = GM_FLOAT;
      a_operands->m_value.m_float = a_operands->m_value.m_vec.z;
   }
}

This function creates the syntax of using a single & to apply different axes to a value. For instance:
vec = (1).x & (2).y; // vec = (1, 2, 0)
The + operator will be used to add things, so:
vec = (1).xy + (2).y; // vec = (1, 3, 0)
Code:
void GM_CDECL gmBcastFloatBitAnd(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      a_operands[0].m_value.m_bcast_flags.m_flags |= a_operands[1].m_value.m_bcast_flags.m_flags;
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x = a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y = a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z = a_operands[1].m_value.m_vec.z;
      }
   }
}

void GM_CDECL gmBcastFloatOpAdd(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x += a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y += a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z += a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x += (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y += (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z += (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
   }
}
void GM_CDECL gmBcastFloatOpSub(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x -= a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y -= a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z -= a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x -= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y -= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z -= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
   }
}
void GM_CDECL gmBcastFloatOpMul(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x *= a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y *= a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z *= a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x *= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y *= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z *= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
   }
}
void GM_CDECL gmBcastFloatOpDiv(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x /= a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y /= a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z /= a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
#      if GMMACHINE_GMCHECKDIVBYZERO
      if (INTTOFLOAT(&a_operands[1]) == 0) { //INTTOFLOAT expects a pointer
         LOG_ERROR << "Divide by zero error" << nl;
         a_operands->Nullify();
         return; //skip the division below
      }
#      endif
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x /= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y /= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z /= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int);
      }
   }
}

void GM_CDECL gmBcastFloatOpNEG(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
      a_operands[0].m_value.m_vec.x = -a_operands[0].m_value.m_vec.x;
   }
   if (a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
      a_operands[0].m_value.m_vec.y = -a_operands[0].m_value.m_vec.y;
   }
   if (a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
      a_operands[0].m_value.m_vec.z = -a_operands[0].m_value.m_vec.z;
   }
}
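
To check the two examples given before this code (`(1).x & (2).y` and `(1).xy + (2).y`), here is a standalone model of the & and + rules. Bcast, MakeBcast, BcastAnd and BcastAdd are hypothetical names; the logic is meant to follow gmBcastFloatBitAnd and the broadcast-float branch of gmBcastFloatOpAdd, with the axes stored in an array to keep it short.

```cpp
#include <cassert>
#include <cstdint>

struct Bcast {
    float         v[3];   // x, y, z
    std::uint32_t flags;  // bit 1 = x, bit 2 = y, bit 4 = z
};

inline Bcast MakeBcast(float f, std::uint32_t flags) {
    Bcast b = { { f, f, f }, flags };
    return b;
}

// '&': copy the axes flagged in rhs into lhs and merge the two flag sets.
inline Bcast BcastAnd(Bcast lhs, const Bcast &rhs) {
    for (int i = 0; i < 3; ++i) {
        if (rhs.flags & (1u << i)) lhs.v[i] = rhs.v[i];
    }
    lhs.flags |= rhs.flags;
    return lhs;
}

// '+': add only on axes flagged in both operands; lhs keeps its own flags.
inline Bcast BcastAdd(Bcast lhs, const Bcast &rhs) {
    for (int i = 0; i < 3; ++i) {
        if (lhs.flags & rhs.flags & (1u << i)) lhs.v[i] += rhs.v[i];
    }
    return lhs;
}
```

Note that the unflagged z slot still holds the broadcast value internally; it shows up as 0 in the examples only because an unflagged axis is never written into the target vector on assignment.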

The <, <=, >, >= operators are all weird, and you might not want them; I'm still undecided about it. I don't like them because they break the rule:
if !(a < b) then a >= b
because they logically AND the comparison across the appropriate axes.
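
A small standalone example of the oddity, using hypothetical helpers (MaskedLess, MaskedGreaterEqual) that apply the same "every flagged axis must pass" rule as the functions below. With a = (1, 5) and b = (2, 3) on the x and y axes, neither a < b nor a >= b holds.

```cpp
#include <cassert>

struct Masked {
    float    v[3];   // x, y, z
    unsigned flags;  // bit 1 = x, bit 2 = y, bit 4 = z
};

// True only if a < b on every axis flagged in both operands.
inline bool MaskedLess(const Masked &a, const Masked &b) {
    for (int i = 0; i < 3; ++i) {
        const unsigned bit = 1u << i;
        if ((a.flags & b.flags & bit) && !(a.v[i] < b.v[i])) return false;
    }
    return true;
}

// True only if a >= b on every axis flagged in both operands.
inline bool MaskedGreaterEqual(const Masked &a, const Masked &b) {
    for (int i = 0; i < 3; ++i) {
        const unsigned bit = 1u << i;
        if ((a.flags & b.flags & bit) && !(a.v[i] >= b.v[i])) return false;
    }
    return true;
}
```

Script code that relies on `if (!(a < b))` meaning `a >= b` would therefore behave unexpectedly with these operators.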

I'm unsure if they're even needed, so it's up to you whether you want to include them in your Vector, but here they are anyway:
Code:
void GM_CDECL gmBcastFloatOpLT(gmThread * a_thread, gmVariable * a_operands) {
   int val = 1;
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x >= a_operands[1].m_value.m_vec.x) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y >= a_operands[1].m_value.m_vec.y) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z >= a_operands[1].m_value.m_vec.z) {
            val = 0;
         }
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x >= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y >= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z >= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
   }
   a_operands->m_type = GM_INT;
   a_operands->m_value.m_int = val;
}
void GM_CDECL gmBcastFloatOpLTE(gmThread * a_thread, gmVariable * a_operands) {
   int val = 1;
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x > a_operands[1].m_value.m_vec.x) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y > a_operands[1].m_value.m_vec.y) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z > a_operands[1].m_value.m_vec.z) {
            val = 0;
         }
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x > (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y > (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z > (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
   }
   a_operands->m_type = GM_INT;
   a_operands->m_value.m_int = val;
}
void GM_CDECL gmBcastFloatOpGT(gmThread * a_thread, gmVariable * a_operands) {
   int val = 1;
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x <= a_operands[1].m_value.m_vec.x) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y <= a_operands[1].m_value.m_vec.y) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z <= a_operands[1].m_value.m_vec.z) {
            val = 0;
         }
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x <= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y <= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z <= (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
   }
   a_operands->m_type = GM_INT;
   a_operands->m_value.m_int = val;
}
void GM_CDECL gmBcastFloatOpGTE(gmThread * a_thread, gmVariable * a_operands) {
   int val = 1;
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[0].m_value.m_bcast_flags.m_flags & 1 && a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x < a_operands[1].m_value.m_vec.x) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2 && a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y < a_operands[1].m_value.m_vec.y) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4 && a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z < a_operands[1].m_value.m_vec.z) {
            val = 0;
         }
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 1) {
         if (a_operands[0].m_value.m_vec.x < (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 2) {
         if (a_operands[0].m_value.m_vec.y < (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
      if (val && a_operands[0].m_value.m_bcast_flags.m_flags & 4) {
         if (a_operands[0].m_value.m_vec.z < (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : a_operands[1].m_value.m_int)) {
            val = 0;
         }
      }
   }
   a_operands->m_type = GM_INT;
   a_operands->m_value.m_int = val;
}


OVERLOAD THE =

That sets up all the internal types we need; we'll add the Vector operations through the user type API at the end. The last modification needed to GameMonkey is to "overload" the = operator.
In GM the = operator always replaces a value with a new value. It's not overloadable by default so doing something like:
vec = math.Vector();
vec = (3).x;
will completely replace vec on line 2 with a broadcast float with the x flag set to true. The desired behavior is to just change the x of the vec. (Remember we can't do vec.x = 3 because vec is a stack variable now, and they're immutable).

In order to do this, we have to change the code where the equals operator gets invoked.
Unfortunately, this is in two different places:
one for local variables, and another for table variables (global variables are actually table variables).
*I hope this is correct! I'm not 100% sure on this, maybe Greg can confirm?*

gmThread.cpp

First we'll change the local equals operator. Open gmThread.cpp, and find the BC_SETLOCAL set local instruction in the switch statement (line 804)
Code:
case BC_SETLOCAL :
   {
      gmuint32 offset = OPCODE_PTR(instruction);
      gmVariable *newVar = (--top);
      if (base[offset].m_type != GM_VECTOR || newVar->m_type != GM_BROADCAST_FLOAT) {
         base[offset] = *newVar;
      } else {
         //only do special stuff if setting an existing vector to a broadcast float type
         if (newVar->m_value.m_bcast_flags.m_flags & 1) {
            base[offset].m_value.m_vec.x = newVar->m_value.m_float;
         }
         if (newVar->m_value.m_bcast_flags.m_flags & 2) {
            base[offset].m_value.m_vec.y = newVar->m_value.m_vec.y;
         }
         if (newVar->m_value.m_bcast_flags.m_flags & 4) {
            base[offset].m_value.m_vec.z = newVar->m_value.m_vec.z;
         }
      }
      break;
   }


gmTableObject.cpp

Now we'll do the same for table variables. Open gmTableObject.cpp, line 216

replace the:
Code:
foundNode->m_value = a_value;

with:
Code:
if (foundNode->m_value.m_type == GM_VECTOR && a_value.m_type == GM_BROADCAST_FLOAT) {
   //only do special stuff if setting an existing vector to a broadcast float type
   if (a_value.m_value.m_bcast_flags.m_flags & 1) {
      foundNode->m_value.m_value.m_vec.x = a_value.m_value.m_float;
   }
   if (a_value.m_value.m_bcast_flags.m_flags & 2) {
      foundNode->m_value.m_value.m_vec.y = a_value.m_value.m_vec.y;
   }
   if (a_value.m_value.m_bcast_flags.m_flags & 4) {
      foundNode->m_value.m_value.m_vec.z = a_value.m_value.m_vec.z;
   }
} else {
   foundNode->m_value = a_value;
}


Create The Vector
That's all for the changes to GM. Now all that's left is adding the operations for GM_VECTOR. All this can be done through the User Type API like you would for any other user type:
Code:
gmFunctionEntry math_lib_functions[] = {
   {"Vector", gm_make_vector},
};
gmFunctionEntry vector_methods[] = {
   {"normalize", gm_vector_normalize},
}; //I have more, just showing the minimum to demonstrate functionality

a_machine->RegisterLibrary(math_lib_functions, sizeof(math_lib_functions) / sizeof(math_lib_functions[0]), "math");
a_machine->RegisterTypeOperator(GM_VECTOR, O_GETDOT, nullptr, gm_vector_op_getdot);
a_machine->RegisterTypeOperator(GM_VECTOR, O_BIT_AND, nullptr, gm_vector_op_and);
a_machine->RegisterTypeOperator(GM_VECTOR, O_ADD, nullptr, gm_vector_op_add);
a_machine->RegisterTypeOperator(GM_VECTOR, O_SUB, nullptr, gm_vector_op_sub);
a_machine->RegisterTypeOperator(GM_VECTOR, O_MUL, nullptr, gm_vector_op_mul);
a_machine->RegisterTypeOperator(GM_VECTOR, O_DIV, nullptr, gm_vector_op_div);
a_machine->RegisterTypeLibrary(GM_VECTOR, vector_methods, sizeof(vector_methods) / sizeof(vector_methods[0]));

int GM_CDECL gm_make_vector(gmThread *a_thread) {
   switch (a_thread->GetNumParams()) {
      case 0:
         a_thread->PushVector(Vector());
         break;
      case 1: {
         float num = gmGetFloatOrIntParamAsFloat(a_thread, 0);
         a_thread->PushVector(Vector(num, num, num));
         break;
         }
      case 2:
         a_thread->PushVector(Vector(gmGetFloatOrIntParamAsFloat(a_thread, 0), gmGetFloatOrIntParamAsFloat(a_thread, 1)));
         break;
      case 3:
         a_thread->PushVector(Vector(gmGetFloatOrIntParamAsFloat(a_thread, 0), gmGetFloatOrIntParamAsFloat(a_thread, 1), gmGetFloatOrIntParamAsFloat(a_thread, 2)));
         break;
      default:
         LOG_ERROR << "Invalid number of parameters to math.Vector() - expecting 0, 1, 2, or 3" << nl;
         break;
   }
   return GM_OK;
}
//these match the functionality of the Broadcast Float type
void GM_CDECL gm_vector_op_getdot(gmThread * a_thread, gmVariable *a_operands) {
   const std::string op_name((static_cast<gmStringObject*>(GM_OBJECT(a_operands[1].m_value.m_ref)))->GetString());
   if (op_name == "x") {
      a_operands[0].m_type = GM_BROADCAST_FLOAT;
      a_operands[0].m_value.m_bcast_flags.m_flags = 1;
   } else if (op_name == "y") {
      a_operands[0].m_type = GM_BROADCAST_FLOAT;
      a_operands[0].m_value.m_bcast_flags.m_flags = 2;
   } else if (op_name == "z") {
      a_operands[0].m_type = GM_BROADCAST_FLOAT;
      a_operands[0].m_value.m_bcast_flags.m_flags = 4;
   } else if (op_name == "xy") {
      a_operands[0].m_type = GM_BROADCAST_FLOAT;
      a_operands[0].m_value.m_bcast_flags.m_flags = 3;
   } else if (op_name == "xz") {
      a_operands[0].m_type = GM_BROADCAST_FLOAT;
      a_operands[0].m_value.m_bcast_flags.m_flags = 5;
   } else if (op_name == "yz") {
      a_operands[0].m_type = GM_BROADCAST_FLOAT;
      a_operands[0].m_value.m_bcast_flags.m_flags = 6;
   } else if (op_name == "xyz") {
      a_operands[0].m_type = GM_BROADCAST_FLOAT; //set the type like the other swizzles do
      a_operands[0].m_value.m_bcast_flags.m_flags = 7;
   } else if (op_name == "xf") {
      a_operands->m_type = GM_FLOAT;
   } else if (op_name == "yf") {
      a_operands->m_type = GM_FLOAT;
      a_operands->m_value.m_float = a_operands->m_value.m_vec.y;
   } else if (op_name == "zf") {
      a_operands->m_type = GM_FLOAT;
      a_operands->m_value.m_float = a_operands->m_value.m_vec.z;
   } else {
      a_operands->Nullify();
   }
}
void GM_CDECL gm_vector_op_add(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_VECTOR) {
      a_operands[0].SetVector(a_operands[0].m_value.m_vec + a_operands[1].m_value.m_vec);   
   } else if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x += a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y += a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z += a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      float f = (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : (float)a_operands[1].m_value.m_int);
      a_operands[0].m_value.m_vec.x += f;
      a_operands[0].m_value.m_vec.y += f;
      a_operands[0].m_value.m_vec.z += f;
   } else {
      a_operands->Nullify();
      LOG_ERROR << "Vector operator + requires a Vector, number or broadcast float parameter" << nl;
      return;
   }
}
void GM_CDECL gm_vector_op_and(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x = a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y = a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z = a_operands[1].m_value.m_vec.z;
      }
   }
}
void GM_CDECL gm_vector_op_sub(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_VECTOR) {
      a_operands[0].SetVector(a_operands[0].m_value.m_vec - a_operands[1].m_value.m_vec);   
   } else if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x -= a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y -= a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z -= a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      float f = (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : (float)a_operands[1].m_value.m_int);
      a_operands[0].m_value.m_vec.x -= f;
      a_operands[0].m_value.m_vec.y -= f;
      a_operands[0].m_value.m_vec.z -= f;
   } else {
      a_operands->Nullify();
      LOG_ERROR << "Vector operator - requires a Vector, number or broadcast float parameter" << nl;
      return;
   }
}
void GM_CDECL gm_vector_op_mul(gmThread * a_thread, gmVariable * a_operands) {
   if (a_operands[1].m_type == GM_VECTOR) {
      a_operands[0].SetVector(a_operands[0].m_value.m_vec % a_operands[1].m_value.m_vec);
   } else if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x *= a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y *= a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z *= a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      float f = (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : (float)a_operands[1].m_value.m_int);
      a_operands[0].m_value.m_vec.x *= f;
      a_operands[0].m_value.m_vec.y *= f;
      a_operands[0].m_value.m_vec.z *= f;
   } else {
      a_operands->Nullify();
      LOG_ERROR << "Vector operator * requires a Vector, number or broadcast float parameter" << nl;
      return;
   }
}
void GM_CDECL gm_vector_op_div(gmThread * a_thread, gmVariable * a_operands) {
   //no checking for divide by zero cause I don't care.
   if (a_operands[1].m_type == GM_BROADCAST_FLOAT) {
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 1) {
         a_operands[0].m_value.m_vec.x /= a_operands[1].m_value.m_vec.x;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 2) {
         a_operands[0].m_value.m_vec.y /= a_operands[1].m_value.m_vec.y;
      }
      if (a_operands[1].m_value.m_bcast_flags.m_flags & 4) {
         a_operands[0].m_value.m_vec.z /= a_operands[1].m_value.m_vec.z;
      }
   } else if (a_operands[1].m_type == GM_FLOAT || a_operands[1].m_type == GM_INT) {
      float f = (a_operands[1].m_type == GM_FLOAT ? a_operands[1].m_value.m_float : (float)a_operands[1].m_value.m_int);
      a_operands[0].m_value.m_vec.x /= f;
      a_operands[0].m_value.m_vec.y /= f;
      a_operands[0].m_value.m_vec.z /= f;
   } else {
      a_operands->Nullify();
      LOG_ERROR << "Vector operator / requires a number or broadcast float parameter" << nl;
      return;
   }
}
static int GM_CDECL gm_vector_normalize(gmThread *a_thread) {
   Vector v(a_thread->ThisVector());
   v.normalize();
   a_thread->PushVector(v);
   return GM_OK;
}


I'll post my Vector class here too in case anyone's confused about how the operations work or wants a more functional Vector class (Real, areEqual, clamp, rtod, SIN and COS come from my own math headers):
Code:
struct Vector; //forward declaration so the prototypes below compile
Vector operator + (const Vector &u, const Vector &v);
Vector operator - (const Vector &u, const Vector &v);
Vector operator * (const Vector &v, Real scalar);
Vector operator * (Real scalar, const Vector &v);
Vector operator / (const Vector &v, Real scalar);
Vector operator % (const Vector &a, const Vector &b); //cross product
Vector operator - (const Vector &v);
Real dot(const Vector &u, const Vector &v);
Vector cross(const Vector &u, const Vector &v);
Vector project(const Vector &u, const Vector &v); //v proj u
Vector projectPreNorm( const Vector& u, const Vector& v ); //v proj u if u is unit
Real angleBetween(const Vector &u, const Vector &v);
Real angleBetweenPreNorm( const Vector &u, const Vector &v); //u & v must be normalized
bool areCollinear(const Vector &u, const Vector &v);
std::ostream& operator<<( std::ostream& o, const Vector& v );

struct VectorData {
   Real x, y, z;
};
struct Vector {
   Vector() {
      memset(&v, 0, sizeof(VectorData));
   }
   Vector(const Vector &src) {
      memcpy(&v, &src.v, sizeof(VectorData));
   }
   Vector(const VectorData &vd) {
      memcpy(&v, &vd, sizeof(VectorData));
   }
   Vector(Real xx, Real yy, Real zz) {
      x = xx;
      y = yy;
      z = zz;
   }
   Vector(Real xx, Real yy) {
      x = xx;
      y = yy;
      z = 0;
   }
   Vector(Real scalar) {
      x = y = z = scalar;
   }
   Real lengthSquared() const                                 { return dot(*this, *this); }
   Real length() const                                       { return sqrt(lengthSquared()); }
   void normalize() {
      Real len = length();
      if (len != 0) {
         (*this) /= len;
      }
   }
   Vector unit() const                                       { return ( *this ) / length(); }
   void operator += (const Vector &v) {
      x += v.x;
      y += v.y;
      z += v.z;
   }
   void operator -= (const Vector &v) {
      x -= v.x;
      y -= v.y;
      z -= v.z;
   }
   void operator *= (const Vector &v) {
      x *= v.x;
      y *= v.y;
      z *= v.z;
   }
   void operator *= (Real scalar) {
      x *= scalar;
      y *= scalar;
      z *= scalar;
   }
   void operator /= (Real scalar) {
      x /= scalar;
      y /= scalar;
      z /= scalar;
   }
   void operator /= (const Vector &v) {
      x /= v.x;
      y /= v.y;
      z /= v.z;
   }
   bool operator == (const Vector &v) const {
      if (!areEqual(x, v.x)) {
         return false;
      }
      if (!areEqual(y, v.y)) {
         return false;
      }
      if (!areEqual(z, v.z)) {
         return false;
      }
      return true;
   }
   Vector &operator = (const Vector &vec) {
      memcpy(&v, &vec.v, sizeof(VectorData));
      return *this;
   }
   Vector &operator = (const VectorData &vd) {
      memcpy(&v, &vd, sizeof(VectorData));
      return *this;
   }
   bool operator != (const Vector &v) const {
      return (!areEqual(x, v.x) || !areEqual(y, v.y) || !areEqual(z, v.z));
   }
   inline void perp() {
      std::swap(x, y);
      x = -x;
   }
   union {
      struct { Real x, y, z; };
      VectorData v;
   };
};

inline Vector operator + (const Vector &u, const Vector &v) {
   return Vector(u.x + v.x, u.y + v.y, u.z + v.z);
}
inline Vector operator - (const Vector &u, const Vector &v) {
   return Vector(u.x - v.x, u.y - v.y, u.z - v.z);
}
inline Vector operator * (const Vector &v, Real scalar) {
   return Vector(scalar * v.x, scalar * v.y, scalar * v.z);
}
inline Vector operator * (Real scalar, const Vector &v) {
   return Vector(scalar * v.x, scalar * v.y, scalar * v.z);
}
inline Vector operator / (const Vector &v, Real scalar) {
   return Vector(v.x / scalar, v.y / scalar, v.z / scalar);
}
inline Vector operator / (Real scalar, const Vector &v) {
   return Vector(scalar / v.x, scalar / v.y, scalar / v.z);
}
inline Vector operator % (const  Vector &a, const Vector &b) {
   return Vector( ( a.y * b.z ) - ( a.z * b.y ), ( a.z * b.x ) - ( a.x * b.z ), ( a.x * b.y ) - ( a.y * b.x ) );
}
inline Real dot(const Vector & u, const Vector &v) {
   return u.x * v.x + u.y * v.y + u.z * v.z;
}
inline Vector cross(const Vector &u, const Vector &v) {
   return u % v;
}
inline Vector operator - (const Vector &v) {
   return v * -static_cast<Real>(1);
}
inline Vector project(const Vector& u, const Vector& v) {
   Vector u2 = u;
   u2.normalize();
   return projectPreNorm( u2, v );
}
inline Vector projectPreNorm(const Vector& u, const Vector& v) {
   return dot( u, v ) * u;
}
inline Real angleBetween(const Vector& u, const Vector& v) {
   Vector u2 = u;
   u2.normalize();
   Vector v2 = v;
   v2.normalize();
   return angleBetweenPreNorm( u2, v2 );
}
inline Real angleBetweenPreNorm(const Vector& u, const Vector& v) {
   Real val = dot( u, v );
   val = clamp<Real>(val, -1.0, 1.0);
   return rtod(acos(val));
}
inline bool areCollinear( const Vector &u, const Vector &v) {
   Vector a = u, b = v;
   a.normalize();
   b.normalize();

   if ( areEqual( a.x, b.x ) && areEqual( a.y, b.y ) && areEqual( a.z, b.z ) ) {
      return true;
   }
   if ( areEqual( a.x, -b.x ) && areEqual( a.y, -b.y ) && areEqual( a.z, -b.z ) ) {
      return true;
   }
   return false;
}
inline std::ostream& operator<<(std::ostream &o, const Vector &v) {
   return o << "[" << v.x << " " << v.y << " " << v.z << "]";
}
inline Vector rotate2D(const Vector &v, const Real theta) {
   const Real st = SIN(theta);
   const Real ct = COS(theta);
   return Vector(v.x * ct - v.y * st, v.x * st + v.y * ct);
}
inline void lerp2D(const Vector &from_v, const Vector &to_v, const float fraction, Vector &destination) {
   destination.x = from_v.x + fraction * (to_v.x - from_v.x);
   destination.y = from_v.y + fraction * (to_v.y - from_v.y);
}

Closing
That's all there is to it! It may seem like a lot more than it really is. This should only take an hour or two.

If I do change gmType type to be short and change the gmVariable union then I'll be sure to post how to do that here, (I'll probably include a diff of the directory, since every file will probably have to be changed).

I can't get over how well GM was made. Other than the equals operator and the operator priority, I had no trouble doing this. I tried a few different approaches, and all this took only two days, even though I had never looked at the GM source before. That's how simple and well-written it is!

Here are some examples of how to use the vectors with the newly added syntax:
Code:
vec = math.Vector(1, 2, 3);
vec2 = vec;
vec = (4).x;

vec.x is 4, vec2.x is 1 since they're stack variables now!

Code:
vec = math.Vector(1, 3);
vec = vec.yf.z;

vec is (1, 3, 3) -> vec.yf is a float that equals 3, then the .z operator gets applied to it, and becomes a broadcast float equal to 3, 3, 3 with the z flag set

Code:
vec = math.Vector(1, 3);
vec = vec.yf.z & vec.y - 1;

sets vec = 1, 2, 3 -> the -1 operator gets applied to the broadcast float vec.y, the & operator sets the z from the vec.yf.z, and the y from vec.y - 1

Code:
vec = math.Vector(1, 3);
vec2 = math.Vector(2, 1);
vec = vec2.x / 4 & vec.y * 5;
vec3 = vec * vec2;

sets vec = (0.5, 15, 0) by doing the / and * on the respective broadcast floats, then the & merges the two broadcast floats by combining their axis flags
vec3 = (0, 0, -29.5) -> the cross product of vec and vec2
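
For anyone who wants to trace those examples outside of GM, here is a rough standalone C++ model of the broadcast float rules described above. It is only a sketch: Vec3, Bcast, selectAxes, splat and assign are made-up names, not GameMonkey types or functions, and it assumes that arithmetic with a plain number touches all three stored components while only the flagged ones survive the assignment.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not GameMonkey types.
struct Vec3 { float x, y, z; };

enum AxisFlags : unsigned { AXIS_X = 1, AXIS_Y = 2, AXIS_Z = 4 };

// A broadcast float: three component values plus a mask of selected axes.
struct Bcast {
    Vec3 v;
    unsigned flags;
};

// Models vec.x, vec.y, vec.xy, ...: keep the vector's values, flag the axes.
inline Bcast selectAxes(const Vec3 &v, unsigned flags) { return Bcast{v, flags}; }

// Models (4).x or vec.yf.z: copy one scalar into every slot, flag the axes.
inline Bcast splat(float f, unsigned flags) { return Bcast{Vec3{f, f, f}, flags}; }

// Scalar arithmetic (assumption: applied to all three stored components).
inline Bcast operator-(Bcast b, float f) { b.v.x -= f; b.v.y -= f; b.v.z -= f; return b; }
inline Bcast operator*(Bcast b, float f) { b.v.x *= f; b.v.y *= f; b.v.z *= f; return b; }
inline Bcast operator/(Bcast b, float f) { b.v.x /= f; b.v.y /= f; b.v.z /= f; return b; }

// a & b: take b's flagged components and combine the two axis masks.
inline Bcast operator&(Bcast a, const Bcast &b) {
    if (b.flags & AXIS_X) a.v.x = b.v.x;
    if (b.flags & AXIS_Y) a.v.y = b.v.y;
    if (b.flags & AXIS_Z) a.v.z = b.v.z;
    a.flags |= b.flags;
    return a;
}

// vec = bcast: only the flagged components of the destination are overwritten.
inline void assign(Vec3 &dst, const Bcast &src) {
    if (src.flags & AXIS_X) dst.x = src.v.x;
    if (src.flags & AXIS_Y) dst.y = src.v.y;
    if (src.flags & AXIS_Z) dst.z = src.v.z;
}

inline Vec3 cross(const Vec3 &a, const Vec3 &b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Mirrors: vec = math.Vector(1, 3); vec = vec.yf.z & vec.y - 1;
inline Vec3 exampleMerge() {
    Vec3 vec{1.0f, 3.0f, 0.0f};
    assign(vec, splat(vec.y, AXIS_Z) & selectAxes(vec, AXIS_Y) - 1.0f);
    return vec;
}

// Mirrors: vec = vec2.x / 4 & vec.y * 5; vec3 = vec * vec2;
inline Vec3 exampleCross() {
    Vec3 vec{1.0f, 3.0f, 0.0f};
    Vec3 vec2{2.0f, 1.0f, 0.0f};
    assign(vec, selectAxes(vec2, AXIS_X) / 4.0f & selectAxes(vec, AXIS_Y) * 5.0f);
    return cross(vec, vec2);
}
```

In C++ the & operator already binds looser than -, * and /, so the expressions group the same way as the GM examples. exampleMerge() ends up at (1, 2, 3) and exampleCross() at (0, 0, -29.5), matching the results listed above.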


Last edited by Shawn on Sat Sep 27, 2008 6:56 pm, edited 5 times in total.

PostPosted: Sat Sep 27, 2008 1:48 am 
Offline

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 698
Thanks for sharing that Shawn. You commented that the performance is pretty good. If the example gmVector3 binding has some similar functionality, you might want to benchmark the two for comparison.

You might also like to try some possible optimizations like:
1) Not zeroing new vectors when they are constructed (perhaps as #define in native code option)
2) Store some gmStringObjects for your member names like 'x' etc, then you can compare integer values instead of strings for some operations. (That is the sole reason for gmMachine::AllocPermanantStringObject())
3) Order or nest conditions for large comparison sets, to improve branch prediction.
It's always best to profile first though, to determine what is worth optimizing.


PostPosted: Sat Sep 27, 2008 2:26 am 
Offline

Joined: Thu Sep 25, 2008 6:28 am
Posts: 24
Ahh, I like that AllocPermanantStringObject(), gunna add that, because it's a very easy optimization.

I have the setting to zero because I do that in my engine already, but yeah, you could easily add a #if around it.

I don't think it's worth optimizing the ordering of the comparison because branch prediction works so differently on a cpu to cpu basis.

What do you think about changing gmType to a short?


PostPosted: Sat Sep 27, 2008 3:04 am 
Offline

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 698
Shawn wrote:
...What do you think about changing gmType to a short?

Because it is difficult to shrink gmVariable, I think the size of gmType is less critical. Also, on 32bit CPUs like Intel for Windows, there is a cost in using < 32bit values. Instructions are added to resize, and parameters are repacked for function calls etc. You really need to benchmark. Sometimes smaller data wins, other times instruction counts and alignment wins. Feel free to mod like crazy though and post results for a specific platform.


PostPosted: Sat Sep 27, 2008 6:44 am 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
Very cool Shawn. Thanks for the contribution. I wasn't missing the individual component assignments and stuff too much so I didn't bother toying with a solution to my lack of support of them, but your implementation has a pretty nice solution to that. Good job. I'll probably upgrade my local copy with your stuff. One thing I think would be great, if Greg agrees, would be to work this whole thing back into the main GM distribution, perhaps with #defines to make it optional. Even though it bloats up the gmVariable size the performance cost of that is pretty insignificant compared to the work it saves over doing something like a vector class as a user type.


PostPosted: Sat Sep 27, 2008 11:12 am 
Offline

Joined: Thu Sep 25, 2008 6:28 am
Posts: 24
My idea of making gmType a short and sticking it in the value union at the very end, and making the flags value a short and putting it just before type would make gmVariable 128 bits:

x (32 bit float) | y (32 bit float) | z (32 bit float) | flags (16 bit short) | type (16 bit short)

This pretty much eliminates the issue of gmVariable bloat, since 32 bit vs 128 bit would be negligible on most CPUs when the value's being passed around, since call parameters are padded to 128 bit anyway.
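
The layout above can be checked at compile time. This is only a sketch of the idea under stated assumptions: PackedVariable and WideVariable are hypothetical structs, not the real gmVariable, and the numbers assume 4-byte floats and no pointer member wider than 32 bits (a 64-bit pointer in the value union would change the sizes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical packed layout; not the actual gmVariable definition.
struct PackedVariable {
    float x, y, z;        // 12 bytes of vector payload
    std::uint16_t flags;  // axis flags at byte offset 12
    std::int16_t type;    // type id at byte offset 14
};

// For comparison: 32-bit flags and a 32-bit type, as in the first post.
struct WideVariable {
    float x, y, z;
    std::uint32_t flags;
    std::int32_t type;
};

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");
static_assert(offsetof(PackedVariable, flags) == 12, "flags follow the three floats");
static_assert(offsetof(PackedVariable, type) == 14, "type fills the last two bytes");
static_assert(sizeof(PackedVariable) == 16, "packed variable fits in 128 bits");
static_assert(sizeof(WideVariable) == 20, "wide variable is 4 bytes over");
```

If any of the static_asserts fail on a given compiler or platform, the layout does not hold there and the packing would need another look.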


PostPosted: Sat Sep 27, 2008 6:57 pm 
Offline

Joined: Thu Sep 25, 2008 6:28 am
Posts: 24
I edited the first post to fix a formatting problem and fix a bug in the vector +, -, *, / operations. I also added a & operation on the Vector that mimics the one for the bcast float.


PostPosted: Sat Sep 27, 2008 7:44 pm 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
how about a diff/patch ?


PostPosted: Sat Sep 27, 2008 8:15 pm 
Offline

Joined: Thu Sep 25, 2008 6:28 am
Posts: 24
sure, I'll make a diff.

I'm going to have three new defines:
#define GM_VECTOR_LIB_INCLUDE
#define GM_VECTOR_TYPE
#define GM_VECTOR_DATA_TYPE

GM_VECTOR_LIB_INCLUDE will be the include file with your vector class: "Math/Vector.h" for example
GM_VECTOR_TYPE will be something like: math::Vector
GM_VECTOR_DATA_TYPE will be something like math::VectorData
(that's what mine are)


PostPosted: Sat Sep 27, 2008 8:52 pm 
Offline

Joined: Thu Sep 25, 2008 6:28 am
Posts: 24
I think this is all you need... try it out and lemme know if I forgot anything.
I have other changes that I tried to separate from these, hopefully I didn't screw anything up.
If they work I'll add them to the initial post.

I created them with gnu difftools' diff for windows with the -u option. My clean GameMonkey directory is C:\Game\GameMonkey\gmsrc\src\gm and I ran the diff from the working modified gm directory.

http://www.rsblsb.com/code/gmVector/gmMachine.cpp.diff
http://www.rsblsb.com/code/gmVector/gmO ... s.cpp.diff
http://www.rsblsb.com/code/gmVector/gmT ... t.cpp.diff
http://www.rsblsb.com/code/gmVector/gmThread.cpp.diff
http://www.rsblsb.com/code/gmVector/gmThread.h.diff
http://www.rsblsb.com/code/gmVector/gmVariable.h.diff

You also need these lines in gmConfig_p.h
Code:
#define GM_VECTOR_LIB_INCLUDE      "Math/Vector.h"
#define GM_VECTOR_TYPE         math::Vector
#define GM_VECTOR_DATA_TYPE      math::VectorData
#define GM_VECTOR_DEFAULT      math::zero_vector
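
A rough illustration of how defines like these keep the engine's vector class out of the GM source. The math namespace below is a hypothetical stand-in for the contents of "Math/Vector.h", and defaultVectorData is a made-up helper. In the real setup the header would be pulled in with #include GM_VECTOR_LIB_INCLUDE instead of being written inline.

```cpp
#include <cassert>

// Hypothetical stand-in for the contents of "Math/Vector.h".
namespace math {
struct VectorData { float x, y, z; };
struct Vector {
    VectorData v;
    Vector() : v{0.0f, 0.0f, 0.0f} {}
    Vector(float x, float y, float z) : v{x, y, z} {}
};
const Vector zero_vector;
}

// In the real setup this part would be: #include GM_VECTOR_LIB_INCLUDE
#define GM_VECTOR_TYPE      math::Vector
#define GM_VECTOR_DATA_TYPE math::VectorData
#define GM_VECTOR_DEFAULT   math::zero_vector

// Code on the GM side names the types only through the macros, so swapping
// in another vector class means changing the defines and nothing else.
inline GM_VECTOR_DATA_TYPE defaultVectorData() {
    const GM_VECTOR_TYPE &d = GM_VECTOR_DEFAULT;
    return d.v;
}
```

The only requirement this sketch places on the plugged-in class is a plain-data member holding the three floats, which is what the data type define refers to.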


PostPosted: Mon Sep 29, 2008 4:30 am 
Offline

Joined: Thu Sep 25, 2008 6:28 am
Posts: 24
So changing the union to be 128 bits is very easy:

Code:
union gmValueUnion {
   int m_int;
   float m_float;
   GM_COLOUR_DATA_TYPE m_colour;
   GM_VECTOR_DATA_TYPE m_vec;
   gmptr m_ref;
   void *m_weak_ptr;
};
struct {
   gmValueUnion m_value;
   uint16 m_flags;
   int16 m_type;
};


This changes the m_bcast_flags.m_flags access to be just m_flags. The value union here is sizeof(float) * 3, so 96 bits. Add 16 bits for the flags and 16 for the type and you have 128 bits!

I used the flags in other situations too, such as a colour type that works the same as the float, and a type index for weak pointers that I added support for.

If there's interest I'll post about how I did weak pointers.


PostPosted: Mon Sep 29, 2008 6:03 pm 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
I'm interested in the weak pointer stuff.


PostPosted: Tue Oct 21, 2008 4:16 am 
Offline

Joined: Thu Sep 25, 2008 6:28 am
Posts: 24
Greg: Would you be able to give me an example of how to do the fast string comparisons for "x", "y", etc?

I get that you alloc a perm string for them up front, but what do you do with the a_operands variable in the getDot operators? do you have to alloc a perm string for them so it returns the same address?

If that's the case, is it really any faster? Maybe I just don't understand how to do it.

Thanks!


PostPosted: Tue Oct 21, 2008 6:01 am 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
I believe that if you allocate permanent string objects up front, like AllocPermanantStringObject("x"), you can simply compare the gmStringObject * you get back from that with whatever gmVariable.m_value.m_ref, as any variable that is "x" will be a reference to that string object.
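
To show why that works, here is a small standalone sketch of string interning. It does not use the GameMonkey API: StringTable, MemberNames and axisFlagFor are hypothetical names, with std::unordered_set standing in for the machine's string table. The assumption is that every string with the same contents resolves to one shared object, so comparing addresses is enough.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical string table for illustration; GameMonkey's own table differs.
class StringTable {
public:
    // Returns one canonical object per distinct string value.
    const std::string *intern(const std::string &s) {
        return &*m_strings.insert(s).first;
    }
private:
    // Element addresses in an unordered_set stay valid across rehashes.
    std::unordered_set<std::string> m_strings;
};

// Member names looked up once, up front, and kept for later comparisons.
struct MemberNames {
    const std::string *x, *y, *z;
    explicit MemberNames(StringTable &t)
        : x(t.intern("x")), y(t.intern("y")), z(t.intern("z")) {}
};

// Maps a member-name pointer to an axis flag using pointer compares only.
inline unsigned axisFlagFor(const MemberNames &names, const std::string *member) {
    if (member == names.x) return 1;
    if (member == names.y) return 2;
    if (member == names.z) return 4;
    return 0;  // unknown member
}
```

In the getdot operator the same idea would replace the std::string construction and the chain of string compares with a handful of pointer compares against objects stored at startup.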


PostPosted: Tue Oct 21, 2008 6:05 am 
Offline

Joined: Fri Nov 24, 2006 9:50 am
Posts: 165
yep, that's the way to do fast comparisons.

