cache_hit

Member
  • Content Count: 777
  • Joined
  • Last visited

Community Reputation: 614 Good

About cache_hit
  • Rank: Advanced Member
  1. Quote: Original post by outRider
        Quote: Original post by cache_hit
            I feel like this is being way overcomplicated. Injecting code into running processes is not that hard, and considerations such as absolute/relative addressing are completely not a factor if done correctly. Once your DLL is loaded into the process's address space, you can write code the same way you always would, just as if it were any other DLL running in your own process. Set up some IPC mechanism, such as a mutex/shared memory or a named pipe, to communicate back and forth with your main Window Detective application. Maybe there's some miscommunication or misunderstanding going on, but I really don't see why we're now talking about writing in assembly language to control the addressing mode of generated code. We need to take a step back and figure out how we got to this point, because something has gone terribly wrong if you feel this is warranted just to get code injected into a remote process.
        It's simple: his DLL is not being loaded into the target process's address space. Instead, he's allocating memory in the target address space and copying a function into that memory. Said function has only been loaded and patched for his process, not the target.

     No, I know; my point is, why not just inject a DLL into the process? All of this business disappears then, and aside from it maybe being slightly easier to detect the presence of an external tool spying on the application, I don't see any downsides. Just much simpler code and a much clearer method.

     BTW, your other suggestion, about moving the code into a separate compilation unit, is related to my original suggestion much earlier about removing the static keyword, as they control the compiler's ability to perform much the same optimizations. I never got an answer about that, but either way I still think the static keyword needs to go.
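     For reference, a minimal sketch of the classic "just inject a DLL" approach being advocated here, via CreateRemoteThread + LoadLibraryW. Error handling is trimmed, and the DLL path passed by the caller is assumed, not taken from Window Detective's actual code:

         // Classic DLL injection: write the DLL's path into the target process,
         // then start a remote thread whose entry point is LoadLibraryW.
         #include <windows.h>
         #include <cwchar>

         bool InjectDll(DWORD pid, const wchar_t* dllPath)
         {
             HANDLE proc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid);
             if (!proc) return false;

             SIZE_T bytes = (wcslen(dllPath) + 1) * sizeof(wchar_t);
             void* remote = VirtualAllocEx(proc, NULL, bytes,
                                           MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
             WriteProcessMemory(proc, remote, dllPath, bytes, NULL);

             // kernel32.dll is mapped at the same base address in every process
             // of a session, so LoadLibraryW's address in our own process is
             // also valid in the target.
             LPTHREAD_START_ROUTINE entry = (LPTHREAD_START_ROUTINE)
                 GetProcAddress(GetModuleHandleW(L"kernel32.dll"), "LoadLibraryW");

             HANDLE thread = CreateRemoteThread(proc, NULL, 0, entry, remote, 0, NULL);
             if (thread) { WaitForSingleObject(thread, INFINITE); CloseHandle(thread); }
             VirtualFreeEx(proc, remote, 0, MEM_RELEASE);
             CloseHandle(proc);
             return thread != NULL;
         }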
  2. I feel like this is being way overcomplicated. Injecting code into running processes is not that hard, and considerations such as absolute/relative addressing are completely not a factor if done correctly. Once your DLL is loaded into the process's address space, you can write code the same way you always would, just as if it were any other DLL running in your own process. Set up some IPC mechanism, such as a mutex/shared memory or a named pipe, to communicate back and forth with your main Window Detective application.

     Maybe there's some miscommunication or misunderstanding going on, but I really don't see why we're now talking about writing in assembly language to control the addressing mode of generated code. We need to take a step back and figure out how we got to this point, because something has gone terribly wrong if you feel this is warranted just to get code injected into a remote process.
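     A hedged sketch of the named-pipe IPC mentioned above; the pipe name "\\.\pipe\WindowDetective" and both function names are hypothetical, not Window Detective's real protocol. The injected DLL writes messages, and the main application reads them:

         #include <windows.h>
         #include <cstring>
         #include <cstdio>

         // In the injected DLL: connect to the tool's pipe and send a message.
         void ReportBack(const char* msg)
         {
             HANDLE pipe = CreateFileA("\\\\.\\pipe\\WindowDetective", GENERIC_WRITE,
                                       0, NULL, OPEN_EXISTING, 0, NULL);
             if (pipe == INVALID_HANDLE_VALUE) return;
             DWORD written = 0;
             WriteFile(pipe, msg, (DWORD)strlen(msg), &written, NULL);
             CloseHandle(pipe);
         }

         // In the main application: create the pipe and read what the DLL sends.
         void Listen()
         {
             HANDLE pipe = CreateNamedPipeA("\\\\.\\pipe\\WindowDetective",
                                            PIPE_ACCESS_INBOUND,
                                            PIPE_TYPE_BYTE | PIPE_WAIT,
                                            1, 4096, 4096, 0, NULL);
             if (pipe == INVALID_HANDLE_VALUE) return;
             if (ConnectNamedPipe(pipe, NULL)) {
                 char buf[4096];
                 DWORD read = 0;
                 while (ReadFile(pipe, buf, sizeof(buf) - 1, &read, NULL) && read) {
                     buf[read] = '\0';
                     printf("injected: %s\n", buf);
                 }
             }
             CloseHandle(pipe);
         }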
  3. Quote: Original post by XTAL256
        Background info: I'm using C++ and Visual Studio 2008. My program (Window Detective) injects code into remote processes. This code works fine in debug but crashes in release mode. I think it is because the generated assembly uses absolute addresses for function calls, which will not work in a remote process.
        *** Source Snippet Removed ***
        Assuming that this is indeed the cause of the crashing, is there any way I can force the compiler to use relative addressing? More specifically, I want a #pragma switch rather than a compiler or linker option. That way I can use this option only for the function that will be injected into the remote thread.

     Using absolute addresses won't work in any process if you have a DLL, not just a remote process. There's another force at work here, and I don't think it has anything to do with a linker setting or relative/absolute addressing. From your point of view, it shouldn't matter that it's executing in a remote process; either way you just end up with a DLL loaded in a process, same as any other. In any case, NOTHING will work in a DLL with absolute addressing, because the DLL can be loaded anywhere in memory. By definition, absolute addressing is totally incompatible with DLLs.

     Why is that function static? Is it a member of a class? If not, that could be your problem. Functions which are static at file scope can have various types of compiler optimizations performed on them that are not possible otherwise. Also, why do you have a pointer to the GetModuleHandle() function? Just call it directly: GetModuleHandle(inj->moduleName).

     [Edited by - cache_hit on September 23, 2010 2:02:41 AM]
  4. cache_hit, in "random vs fixed sleep interval":

     Why don't you just set the priority of the process to the lowest possible value? Then you'll always have free CPU cycles when you want to use the computer for something else, but if nothing else is going on, it will run at full speed.
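     A minimal sketch of this suggestion, assuming a Win32 target (the long-running work itself is elided):

         // Drop the current process to idle priority: it soaks up spare CPU
         // cycles but yields immediately whenever anything else needs to run.
         #include <windows.h>

         int main()
         {
             SetPriorityClass(GetCurrentProcess(), IDLE_PRIORITY_CLASS);
             // ... run the long computation here at full speed ...
             return 0;
         }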
  5. Quote: Original post by MJP
        Quote: Original post by cache_hit
            Everyone always talks about how you should use DOD up front, but really, has anyone worked on a commercial game where this has happened? I sure haven't. DOD is always done in the optimization phase, where you need extra performance. The reason for this is that it doesn't matter how fast or slow your game is if your game can't ship. Shipping a game requires throwing lots of these ideal theoretical situations out the window. There are plenty of things we'd like to do if we were living in an ideal world, but can't because of reality. Writing 300,000 lines of code from the ground up using DOD is one of them.
        Your game can't ship if it runs at 10 frames per second. For certain platforms and certain categories of games, there's no way in hell you're going to hit 30fps if your engine isn't designed with CPU/memory performance in mind. There's been more than enough shitty PS3 ports to prove that point (or even Xbox 360 ports, for that matter). Besides, we're talking about the renderer here and not the entire game... I don't think it's unreasonable to implement your renderer with regard to performant memory-access patterns, while using higher-level code for gameplay-oriented stuff.

     I guess we're talking about different things then, because the OP was just asking a general question about virtual functions, nothing renderer specific. And in that case, I think it's a pretty bad idea to suffer the productivity loss associated with DOD.

     On the other hand, what would be really interesting is if someone came up with a programming language where SOA had first-class support. That is, you could design your code in an object-oriented fashion, but when creating arrays of such objects the language would lay them out in memory in SOA fashion, and writing, for example, objects.foo would refer not to &objects + i*sizeof(objects) + offsetof(objects::foo), but rather to &objects + offsetof(objects::foo) + i*sizeof(objects::foo). Then you could basically have a syntax like this:

         class foo { int a; int b; int c; };

         int main()
         {
             foo* objects = new foo[20];
             foreach (int& x in objects.a) { x *= 2; }
             foreach (int& x in objects.b) { x *= 3; }
             foreach (int& x in objects.c) { x *= 4; }
         }

     Hey Apoch, wanna add this to Epoch? :D

     TL;DR - I'm all for DOD when it makes sense. But I'm not convinced it makes sense for the OP.
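     Lacking such a language, a hedged sketch of the same layout hand-rolled in present-day C++ (the type soa_foo is hypothetical): each field gets its own contiguous array, so each loop streams through exactly one array, which is the cache behavior the post is after.

         #include <vector>
         #include <cstddef>

         struct soa_foo {
             std::vector<int> a, b, c;                  // one contiguous array per field
             explicit soa_foo(std::size_t n) : a(n), b(n), c(n) {}
         };

         int main()
         {
             soa_foo objects(20);
             for (int& x : objects.a) x *= 2;           // touches only a[]
             for (int& x : objects.b) x *= 3;           // touches only b[]
             for (int& x : objects.c) x *= 4;           // touches only c[]
         }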
  6. Everyone always talks about how you should use DOD up front, but really, has anyone worked on a commercial game where this has happened? I sure haven't. DOD is always done in the optimization phase, where you need extra performance. The reason for this is that it doesn't matter how fast or slow your game is if your game can't ship. Shipping a game requires throwing lots of these ideal theoretical situations out the window. There are plenty of things we'd like to do if we were living in an ideal world, but can't because of reality. Writing 300,000 lines of code from the ground up using DOD is one of them.
  7. Quote: Original post by Yann L
        Quote: Original post by cache_hit
            Sure you can; in fact, it's silly NOT to ignore them until you find out that they're causing you a performance problem.
        That kind of mindset creates more and more problems in high-performance computing nowadays. People think they can ignore (or just don't understand) cache implications when implementing code. The result is that the same algorithm can run hundreds of times slower, only because it has been implemented without any regard to cache efficiency. And the implementors of said sub-optimal code don't even realize this.

     I agree, but we're talking about game development, not HPC. There are hundreds of thousands of lines of code in your typical commercial game, and only a very small fraction of that code needs to conform to such performance standards.
  8. Quote: Original post by Yann L
        Quote: Original post by ApochPiQ
            If you cannot prove that the algorithmic complexity of a performance issue is the problem, you need to be using a profiler.
        Actually, you need to be using a profiler anyway. Never, ever, ever, ever guess. Profile. Although obviously sound advice, it doesn't work in all cases. The branch misprediction penalty typically associated with the indirect call generated from the vtable dereference will often not show up easily in a profiler.

     Err... branch misprediction on a function pointer call? It's an unconditional jump.

     Quote: Original post by Yann L
        Quote: Original post by ApochPiQ
            Ignoring cache issues, each indirection adds roughly 2ns on a Core 2 processor.
        The problem is that you cannot ignore cache issues. They can impose a very heavy penalty, sometimes orders of magnitude more than the actual opcode execution time.

     Sure you can; in fact, it's silly NOT to ignore them until you find out that they're causing you a performance problem. Besides, if you change the above code, you're basically just going to replace it with a huge if statement, so now you're trading cache misses for branch mispredictions and potentially many comparison operations. The winner isn't always clear; it depends on usage patterns. If it's in an inner loop, the address will typically be in the TLB every time and the lookup will be really fast. If it has equal probability of being any of the subclasses, and there are many subclasses, then the compiler is just going to make a jump table out of it anyway, in which case it's no different. And in the worst case, the if-statement approach can result in slower code too if the prediction rate is low.

     IMO, unless your code needs to run on a PS3, there's no room for this type of optimization until the last stages of the game, when you're trying to hit a desired frame rate, or number of people in the world, or that sort of thing.
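     A hedged sketch of the trade-off being argued about, with hypothetical types: with many roughly equiprobable cases, the compiler typically lowers a dense switch to a jump table, which is an indirect branch just like the vtable call it replaced.

         struct Shape { virtual float area() const = 0; virtual ~Shape() {} };

         // Virtual dispatch: one indirect call through the vtable.
         float area_virtual(const Shape& s) { return s.area(); }

         enum Kind { CIRCLE, SQUARE, TRIANGLE };

         // The "huge if/switch" alternative: a dense, many-way switch usually
         // compiles to a jump table, i.e. another indirect branch.
         float area_switch(Kind k, float s)
         {
             switch (k) {
                 case CIRCLE:   return 3.14159f * s * s;
                 case SQUARE:   return s * s;
                 case TRIANGLE: return 0.5f * s * s;
             }
             return 0.0f;
         }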
  9. cache_hit, in "Memory corruption help":

     I second Hodgman's response. Your program is fine, but you can't reliably debug optimized binaries; once the optimizer has reordered, enregistered, or eliminated variables, the debugger's view of them is often wrong. printf the value and you'll see that it's fine.
 10. Quote: Original post by jwezorek
        Jeffrey Richter's book Advanced Windows has a chapter in which he discusses all of this stuff. It's been a while, so let me google... yeah, according to the bibliography of this Dr. Dobb's article, Chapter 18 of Advanced Windows is called "Breaking Through Process Boundary Walls"; this is the chapter I'm talking about. I'm not sure Advanced Windows is still up to date, however. It apparently has a 4th edition now titled Programming Applications for Microsoft Windows, but I don't know if the stuff about crossing process boundary walls is still included; I had the old book.

     BTW, the newest version of that book is called "Windows via C/C++". It's basically a complete re-write of the original, which is why the naming scheme was changed. Amazing book though; a must-read.
 11. Can't you just define a structure containing a function pointer, call CreateRemoteThread with a pointer to that structure as the context parameter, and then, in your thread proc, cast the parameter back to the structure type to recover your function pointer? Alternatively, if you want to be able to call a variety of functions, pass the module handle in the structure instead of a function pointer. This is kind of what you said, except that instead of the remote thread "getting the module handle", you're giving it to the remote thread at launch. See the sketch below.
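     A hedged sketch of that idea; the struct layout and names are hypothetical, and of course both the struct and the code it points to must already be valid in the target process's address space for this to work remotely.

         #include <windows.h>

         // Context passed through CreateRemoteThread's lpParameter.
         struct InjectContext
         {
             HMODULE module;             // handed to the remote thread at launch
             void (*work)(HMODULE);      // the function the thread should run
         };

         // The thread proc casts the parameter back and calls through the pointer.
         DWORD WINAPI RemoteThreadProc(void* param)
         {
             InjectContext* ctx = static_cast<InjectContext*>(param);
             ctx->work(ctx->module);
             return 0;
         }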
 12. Quote: Original post by zyrolasting
        I'm sorry, but that does not work in my case. I appreciate your reply, though. I should be more specific, but note I will be talking about Direct3D9. (MODS: move if needed) I want to automate volatile resource recovery when I lose my D3D device. The resources all have unique usage IDs and associate themselves with a centralized global container on construction.
        *** Source Snippet Removed ***
        The container would validate all of its assets on a reset-device event using validateAsset(). The template caused me trouble, however. Type T and buildAsset() should be of concern to those subclassing asset, but should NOT be of concern to the container. For example, you can have an asset<VertexBuffer> or asset<Texture> (which need to be built differently), but the container should not care about how they build, or more importantly, what they build. All it should care about is that they have been validated after a certain event. If I use inheritance, I have to toss out comptr<T> and buildAsset(), which are vital to the asset class. I would likely need a ton of code repetition or syntax acrobatics without them.

     asset<T> already has a virtual function, so it's not like adding more virtual functions would be terrible; you've already accepted the per-object vtable-pointer hit (4 bytes on a 32-bit build). So do this:

         class asset_base;
         template<class T> class asset;

         class container
         {
         public:
             void addAsset(asset_base* f);
             asset_base* getAsset() { ... }

             template<class T>
             asset<T>* getAssetAs() { return static_cast<asset<T>*>(getAsset()); }
         };

         class asset_base
         {
         protected:
             container* mContainer;
             virtual asset_base* build() = 0;
         };

         template<class T>
         class asset : public asset_base
         {
         private:
             comptr<T> mPtr;
             virtual asset<T>* build()   // note the co-variant return type
             { ... }
         };

     Does this still not work?
 13. Quote: Original post by AverageMidget
        I'm having difficulty understanding your goal, but that's my problem, because I'm dead tired. That's my excuse, in case I'm way off on what I'm about to say. :p I don't think you can rely on a pointer's address staying constant. Isn't it the right of the OS to page memory in and out on an as-needed basis? If you calculate a hash based on a pointer's memory address, and the OS then pages your memory out and back in at a different location, your calculated hash isn't valid anymore. Hopefully I don't sound like an idiot. Man, I need to go to sleep.

     This isn't an issue. Yes, the OS can page memory in and out, but that doesn't mean the value of the pointer will change. Pointers hold virtual addresses, and paging only moves the backing physical storage; if the memory is paged out, the next access simply incurs a page fault and the data is paged back in at the same virtual address. The pointer value stays constant over a single run of the program.
 14. Sure, it's possible. In some specialized circumstances you can run into problems related to the fact that pointer values are non-deterministic. For example, if you save the game to a save file and then re-load it, the pointers won't be the same, and hence the objects will hash to different locations. There may be other reasons why using pointer values is undesirable, but the above example is only problematic in a limited number of situations.
 15. Quote: Original post by Aardvajk
        A common approach would be to use a std::vector of pointers to your entities. You need to delete them when you are done, though:
        *** Source Snippet Removed ***
        You might like to look into boost::ptr_vector instead.

     I would definitely not recommend anything Boost-related given the OP's apparent level of experience with C++.
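     The quoted snippet was removed in archiving; purely as a hedged illustration of the pattern under discussion (not Aardvajk's original code, and Entity is a hypothetical type):

         #include <vector>

         struct Entity { /* position, health, etc. */ };

         int main()
         {
             std::vector<Entity*> entities;
             entities.push_back(new Entity);
             entities.push_back(new Entity);

             // ... use the entities ...

             // Manual cleanup, easy to forget; this is the hazard that makes
             // boost::ptr_vector (or smart pointers) attractive.
             for (std::size_t i = 0; i < entities.size(); ++i)
                 delete entities[i];
             entities.clear();
         }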