NVidia CRTC Registers Mapping

15 comments, last by mp3man2000 19 years, 11 months ago
Hi there. If anyone can help or point me in the right direction, it would be most appreciated. I am writing an application that will synchronise the video output of a graphics card (NVidia) to an incoming program stream (MPEG4). The reason for this is to stop horizontal tearing. I have managed to enumerate the video card on the PCI bus and gather its memory start address information, and I can directly read the display RAM. I believe I have found the start address for the memory-mapped CRTC registers etc. My problem is that I now have no idea where and what all the appropriate registers are. I am trying to do this because I believe that if I poll the buffer status register, I should be able to determine when a third-party video decoder is trying to do a screen refresh with a new frame of video. By using this in conjunction with the vertical retrace, I could then make adjustments to the timing registers to synchronise the whole thing. The result should be silky smooth video with no dropped or repeated frames. If anyone has had some experience with talking directly to NVidia graphics cards, I would be most appreciative of any advice I can get. Thank you in advance.
You sure you want to do that? Nasty business. I had a go at an init-only driver (just setting the video mode), and didn't manage to produce a stable mode. In fact, this is about where I realized writing an OS with drivers alone is hopeless.

Anyway, here are some snippets from code I had as a reference; maybe you can plug it into google and see what comes up.
[table i compiled of interesting regs for mode set]
var             where (mmio+x)  value           block
ram bandwidth:  0x00100000      0               PFB
mem type:       0x00100200      0x08c10110
cfg1:           0x00100204      0
mem width:      0x00101000      0x803fc447      PEXTDEV
offset0:        0x00400640      0               PGRAPH
offset1:        0x00400644      0
offset2:        0x00400648      0
offset3:        0x0040064C      0
pitch0:         0x00400670      0
pitch1:         0x00400674      0
pitch2:         0x00400678      0
pitch3:         0x0040067C      0
cursor2:        0x00680300      0x01800200      PRAMDAC
pll:            0x00680500      0x0001fb09
pll:            0x00680504      0x00003302
vpll:           0x00680508      0x0002d308
pllsel:         0x0068050C      0x10000700
general:        0x00680600      0x00100100

[load/unload hw state]
    VGA_WR08(0x03D4, 0x19);
    VGA_WR08(0x03D5, state->repaint0);
    VGA_WR08(0x03D4, 0x1A);
    VGA_WR08(0x03D5, state->repaint1);
    VGA_WR08(0x03D4, 0x25);
    VGA_WR08(0x03D5, state->screen);
    VGA_WR08(0x03D4, 0x28);
    VGA_WR08(0x03D5, state->pixel);
    VGA_WR08(0x03D4, 0x2D);
    VGA_WR08(0x03D5, state->horiz);
    VGA_WR08(0x03D4, 0x1B);
    VGA_WR08(0x03D5, state->arbitration0);
    VGA_WR08(0x03D4, 0x20);
    VGA_WR08(0x03D5, state->arbitration1);
    VGA_WR08(0x03D4, 0x30);
    VGA_WR08(0x03D5, state->cursor0);
    VGA_WR08(0x03D4, 0x31);
    VGA_WR08(0x03D5, state->cursor1);
    chip->PRAMDAC[0x00000300/4]  = state->cursor2;
    chip->PRAMDAC[0x00000508/4]  = state->vpll;
    chip->PRAMDAC[0x0000050C/4]  = state->pllsel;
    chip->PRAMDAC[0x00000600/4]  = state->general;

    chip->CURSOR           = &(chip->PRAMIN[0x00010000/4 - 0x0800/4]);
    chip->CURSORPOS        = &(chip->PRAMDAC[0x0300/4]);
    chip->VBLANKENABLE     = &(chip->PCRTC[0x0140/4]);
    chip->VBLANK           = &(chip->PCRTC[0x0100/4]);

    state->cursor1      = VGA_RD08(0x03D5);
    state->cursor2      = chip->PRAMDAC[0x00000300/4];
    state->vpll         = chip->PRAMDAC[0x00000508/4];
    state->pllsel       = chip->PRAMDAC[0x0000050C/4];
    state->general      = chip->PRAMDAC[0x00000600/4];
    state->config       = chip->PFB[0x00000200/4];
    state->offset0  = chip->PGRAPH[0x00000640/4];
    state->offset1  = chip->PGRAPH[0x00000644/4];
    state->offset2  = chip->PGRAPH[0x00000648/4];
    state->offset3  = chip->PGRAPH[0x0000064C/4];
    state->pitch0   = chip->PGRAPH[0x00000670/4];
    state->pitch1   = chip->PGRAPH[0x00000674/4];
    state->pitch2   = chip->PGRAPH[0x00000678/4];

[page flipping routine]
    int offset = start >> 2;
    int pan    = (start & 3) << 1;
    unsigned char tmp;
    /*
     * Unlock extended registers.
     */
    chip->LockUnlock(chip, 0);
    /*
     * Set start address.
     */
    VGA_WR08(0x3D4, 0x0D); VGA_WR08(0x3D5, offset);
    offset >>= 8;
    VGA_WR08(0x3D4, 0x0C); VGA_WR08(0x3D5, offset);
    offset >>= 8;
    VGA_WR08(0x3D4, 0x19); tmp = VGA_RD08(0x3D5);
    VGA_WR08(0x3D5, (offset & 0x01F) | (tmp & ~0x1F));
    VGA_WR08(0x3D4, 0x2D); tmp = VGA_RD08(0x3D5);
    VGA_WR08(0x3D5, (offset & 0x60) | (tmp & ~0x60));
    /*
     * 4 pixel pan register.
     */
    offset = VGA_RD08(chip->IO + 0x0A);
    VGA_WR08(0x3C0, 0x13);
    VGA_WR08(0x3C0, pan);
}
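The bit shuffling in the page flipping routine is easier to follow when it is separated from the port writes. The sketch below computes the same values without touching hardware. The struct, field and function names are made up for illustration and do not come from riva_hw.c; only the shifts and masks are taken from the snippet above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values the page flipping routine writes. */
typedef struct {
    uint8_t crtc_0d;      /* CRTC index 0x0D: offset bits 0..7 */
    uint8_t crtc_0c;      /* CRTC index 0x0C: offset bits 8..15 */
    uint8_t crtc_19_bits; /* CRTC index 0x19, mask 0x1F: (offset >> 16) & 0x1F */
    uint8_t crtc_2d_bits; /* CRTC index 0x2D, mask 0x60: (offset >> 16) & 0x60 */
    uint8_t pan;          /* attribute index 0x13: (start & 3) << 1 */
} start_addr_fields;

/* Split a start address the same way the routine above does. */
static start_addr_fields split_start_address(uint32_t start)
{
    start_addr_fields f;
    uint32_t offset = start >> 2;   /* the low two bits are handled by pan */

    f.pan          = (uint8_t)((start & 3u) << 1);
    f.crtc_0d      = (uint8_t)(offset & 0xFFu);
    f.crtc_0c      = (uint8_t)((offset >> 8) & 0xFFu);
    f.crtc_19_bits = (uint8_t)((offset >> 16) & 0x1Fu);
    f.crtc_2d_bits = (uint8_t)((offset >> 16) & 0x60u);
    return f;
}

/* Quick self-check with a sample address. */
static void split_start_address_example(void)
{
    start_addr_fields f = split_start_address(0x00123457u);
    assert(f.crtc_0d == 0x15 && f.crtc_0c == 0x8D);
    assert(f.crtc_19_bits == 0x04 && f.crtc_2d_bits == 0x00);
    assert(f.pan == 6);
}
```

In the real routine the two masked fields are merged into the existing register contents with a read-modify-write, so the other bits of registers 0x19 and 0x2D are preserved.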
E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3
Thank you very much for your reply. The information you have supplied is nearly exactly what I am after. Do you have any more info? Where did you get this from?
Did you use these values as offsets from the start memory address values returned by the card?

Is there any more info for each of the registers you have listed?
I am now about to lock up my PC.
Yes, those addresses are offsets in the card's memory-mapped register file (base address is the smaller memory range indicated in PCI config space). The code accesses them as 0x??/4 because the array elements are declared as 32 bits.
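A minimal sketch of that indexing, assuming the register file has already been mapped and is viewed as an array of 32-bit words. The macro and function names are illustrative; the two offsets are taken from the table above, where the vpll register sits at 0x00680508 and the code reads it as PRAMDAC[0x508/4].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets as they appear in the table earlier in the thread. */
#define NV_PRAMDAC_BASE 0x00680000u
#define NV_VPLL_OFFSET  0x00000508u

/* Convert a byte offset into an index for an array of 32-bit registers. */
static size_t reg_index(uint32_t byte_offset)
{
    assert(byte_offset % 4 == 0);   /* registers are 32-bit aligned */
    return byte_offset / sizeof(uint32_t);
}

/* Read a 32-bit register given the mapped base and a byte offset. */
static uint32_t reg_read32(const volatile uint32_t *mmio, uint32_t byte_offset)
{
    return mmio[reg_index(byte_offset)];
}
```

With this, `reg_read32(mmio, NV_PRAMDAC_BASE + NV_VPLL_OFFSET)` addresses the same word as `chip->PRAMDAC[0x508/4]` when `chip->PRAMDAC` points at `mmio + 0x00680000/4`.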

It's been years, and my references have evaporated (one HD loss too many? :/). Anyway, I think the Nvidia-specific code was from riva_hw.c (should be easy to find). Also try XFree86.
I was surprised how much runs over the old port I/O interface. Ralf Brown's port list and the VGA Guide helped there.

I would love to see register docs as well (at least the mode set stuff; don't have to expose 3d FIFOs), but all NVidia released was this one piece of code for Riva TNT and GeForce, 4-5 years back or so. Oh well, better than nothing.

> I am now about to lock up my PC.
hehe, ain't that the truth. I crashed my machine so often, it wasn't even funny.
Well, after a bit more searching I found the source that you mentioned, thank you (nv4ref.h is one of the useful ones). I had a look at some of the address ranges that were obvious and added them to the start address returned from the card, and bingo, it was there (I used the start gwhcursor). That is a good one because it shows you where the Hardware Cursor (mouse) is.

I actually was hoping that the CRTC settings were mapped there somewhere as well, however it doesn't look like they are, but just using the old out and in gets those values alright.

I have been looking at the IO CRTC registers while adjusting values in PowerStrip and I can see them changing - cool.

Now the memory-mapped ones: I am a little confused about the PLL settings. For instance, I watch the VPLL M, N and P values when I change the pixel clock in PowerStrip; I can see them changing, but they don't make a lot of sense at the moment. I have programmed PLLs before, but I am not sure whether these are divider values or not, e.g. at 102.000 MHz they are 0x09 0x44 0x01 and at 108.000 MHz they are 0x02 0x10 0x01. The fourth byte doesn't seem to be used; it is just 0x00 all the time, and in the riva file it seems that that's how it is.

Have you had any experience with the PLL values? Maybe the bytes or nibbles are reverse ordered or inverted?

[edited by - mp3man2000 on April 22, 2004 7:40:54 PM]
> That is a good one because it shows you where the Hardware Cursor (mouse) is.
Indeed useful. More so if its position is set in the ISR - no DPC needed, cursor shows even if OS is hosed.

> I actually was hoping that the CRTC settings were mapped there somewhere as well, however it doesn't look like they are, but just using the old out and in gets those values alright.
Yep.

quote:Now the memory-mapped ones: I am a little confused about the PLL settings. For instance, I watch the VPLL M, N and P values when I change the pixel clock in PowerStrip; I can see them changing, but they don't make a lot of sense at the moment. I have programmed PLLs before, but I am not sure whether these are divider values or not, e.g. at 102.000 MHz they are 0x09 0x44 0x01 and at 108.000 MHz they are 0x02 0x10 0x01. The fourth byte doesn't seem to be used; it is just 0x00 all the time, and in the riva file it seems that that's how it is.

My experience with programming the PLL was a video mode that sort of 'swam' on the monitor. But looking at riva_hw.c, we can figure it out:
    chip->CrystalFreqKHz = (chip->PEXTDEV[0x00000000/4] & 0x00000020) ? 14318 : 13500;
I think 13.5 MHz is what we have.
    MClk = (N * chip->CrystalFreqKHz / M) >> P;
For m = 9, n = 68, we get 102 / 2; m = 2, n = 16 yields 108 / 2. P messes things up: it's shifted down 1 bit. Maybe the card wants it that way?
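The arithmetic can be checked with a small stand-alone function. This is only a sketch of the formula quoted from riva_hw.c, with all values in kHz and integer division; the function name is made up, and the M, N, P values are the ones read out with PowerStrip (0x09 0x44 0x01 and 0x02 0x10 0x01).

```c
#include <assert.h>
#include <stdint.h>

/* (N * crystal / M) >> P, as in the riva_hw.c line quoted above.
 * N multiplies, M divides, P is a power-of-two post divider. */
static uint32_t pll_khz(uint32_t crystal_khz, uint32_t m, uint32_t n, uint32_t p)
{
    assert(m != 0);   /* M is a divider, so it must not be zero */
    assert(p < 32);   /* keep the shift well defined */
    return (n * crystal_khz / m) >> p;
}
```

With a 13500 kHz crystal, `pll_khz(13500, 9, 68, 1)` gives 51000 and `pll_khz(13500, 2, 16, 1)` gives 54000, which is half of the 102 MHz and 108 MHz shown by PowerStrip. The same calls with a 27000 kHz reference give 102000 and 108000.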

Interesting stuff..
OK, I was confused when you first put up the math for calculating the pixel clock. That last shift was confusing.

I looked up the offending code and worked out what you meant. After thinking about it for a while and doing some more calculations, it hit me that the clock source must be 27 MHz; this makes that formula work OK. It actually makes sense because the card I am talking to is a GeForce4 Ti, which probably means that they have standardised the clock frequency to what it probably should be. 27 MHz clocks have been used in professional video equipment for years, because the correct horizontal frequencies can be derived from them.

I am still a bit confused by where you read the clock frequency from (what is the offset?)....

Also, the vertical retrace bit doesn't seem to be implemented any more. I have tried enabling and disabling it, and indeed when I do OpenGL buffer swaps with it enabled it does limit the framerate to the refresh rate, but I need to be able to poll something so I know when they are. Unfortunately, starting an OpenGL instance and waiting for the buffer swap function to return is not an elegant way of doing it, especially when I won't actually be drawing any video.

Maybe I have to read the current drawing position in the video buffer (is that even possible?).

Thanks again for your time in replying to these posts, it is much more fun bouncing ideas off someone else, and by the way I haven't locked the PC up again yet.

If you were wondering, I am actually doing all this under WinXP, which has made reading and writing to physical memory and IO ports interesting. So far I have found the best way has been to use the TVicHW32 library; it has been fantastic and Windows hasn't complained once.

[edited by - mp3man2000 on April 22, 2004 11:33:09 PM]
Yeah. Interesting about the 27 MHz; I don't know either why everything is divided by 2.

EXTDEV region offset: googling PEXTDEV yields "nv->pextdev = mmio+0x00101000/4". mmio is a ulong*, so 0x00101000 is the byte offset.
BTW, the nvidia.c that came up is very interesting - will have a closer look later.

quote:Also, the vertical retrace bit doesn't seem to be implemented any more. I have tried enabling and disabling it, and indeed when I do OpenGL buffer swaps with it enabled it does limit the framerate to the refresh rate, but I need to be able to poll something so I know when they are. Unfortunately, starting an OpenGL instance and waiting for the buffer swap function to return is not an elegant way of doing it, especially when I won't actually be drawing any video.

You mean the bit in reg 0x3da (IIRC) isn't toggled anymore on refresh? That's bad.
Another idea: do you have access to the VESA VBE functions? ah=0x4f, int 0x10. Those are a standardized way of talking to SVGA cards, a reaction to the mess of card-specific code. In particular, there was a triple-buffering function that let you schedule a display flip after the next vsync. Maybe you could use that?
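For reference, the classic way to poll that status bit looks roughly like the sketch below. On standard VGA, port 0x3DA is Input Status #1 in colour mode and bit 3 is set during vertical retrace; whether the GeForce4 in question still toggles it is exactly what is in doubt here. The port reader is passed in as a function pointer because the actual port I/O call depends on the driver or library in use; `sim_read8` is a made-up stand-in so the logic can be tried without hardware.

```c
#include <assert.h>
#include <stdint.h>

#define VGA_INPUT_STATUS_1 0x3DAu  /* colour-mode address */
#define VGA_VRETRACE_BIT   0x08u   /* bit 3: vertical retrace in progress */

/* Port reader supplied by the caller (hypothetical hook). */
typedef uint8_t (*port_read8_fn)(uint16_t port);

/* Wait for the start of the next vertical retrace.
 * Returns 1 on success, 0 if max_polls loop iterations were used up first. */
static int wait_for_vretrace_start(port_read8_fn read8, unsigned long max_polls)
{
    unsigned long polls = 0;
    assert(read8 != 0);

    /* If a retrace is already in progress, wait for it to end first
     * so that a fresh rising edge is caught. */
    while (read8(VGA_INPUT_STATUS_1) & VGA_VRETRACE_BIT) {
        if (++polls >= max_polls) return 0;
    }
    /* Now wait for the bit to go high. */
    while (!(read8(VGA_INPUT_STATUS_1) & VGA_VRETRACE_BIT)) {
        if (++polls >= max_polls) return 0;
    }
    return 1;
}

/* Simulated status port: reports retrace on 2 reads out of every 10. */
static unsigned long sim_reads;
static uint8_t sim_read8(uint16_t port)
{
    (void)port;
    return (sim_reads++ % 10 >= 8) ? VGA_VRETRACE_BIT : 0;
}
```

The `max_polls` limit matters on a card where the bit never changes: without it the loop would spin forever, which is a cheap way to find out that the bit is not implemented.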


quote:Maybe I have to read the current drawing position in the video buffer (is that even possible?).
hmm, interesting idea. Always staying behind that would eliminate flickering and tearing, but I have no idea how to get at the counter / if it's even possible.

quote:Thanks again for your time in replying to these posts, it is much more fun bouncing ideas off someone else, and by the way I haven't locked the PC up again yet.

Glad to. hehe

quote:If you were wondering, I am actually doing all this under WinXP, which has made reading and writing to physical memory and IO ports interesting. So far I have found the best way has been to use the TVicHW32 library; it has been fantastic and Windows hasn't complained once.

Was about to ask. Costs though. Writing a simple WDM driver that does port I/O on behalf of the app is on my todo.
I haven't actually made much progress since my last post, well, not on what I am supposed to. I have however had fun learning about OpenGL. I am playing with surfaces and lighting at the moment; I will have a bit of a play with textures next, I think.

Anyway, I still haven't had a reply from NVIDIA. I suspect that there is a lot of information that can be extracted from the new cards, considering that the info we are playing with is for the old RIVA card. Anyway, I will have a bit more of a look at the OpenGL tutes....

PS. I wrote a quick program that iterates through all the possible M, N & P values for the PLL, removes repeated frequencies, then sorts by frequency. The results are returned in an include file. I did this because I was wondering what the granularity could be with regards to PLL frequencies. I will throw in a quick snippet. This is only the start, but you will get the general idea.


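#ifndef DLS_NVPLL_TBL_H
#define DLS_NVPLL_TBL_H

/* One PLL setting. The rows listed below are consistent with
   f = (m * 27000 / n) >> p, truncated, with f in kHz. */
typedef struct
{
    long int f;
    int m;
    int n;
    int p;
} _DLS_NVPLL_tbl;

_DLS_NVPLL_tbl DLS_NVPLL_tbl[]=
{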
{ 45059,252,151, 0}, { 45060,247,148, 0},
{ 45062,242,145, 0}, { 45063,237,142, 0},
{ 45064,232,139, 0}, { 45066,227,136, 0},
{ 45067,222,133, 0}, { 45069,217,130, 0},
{ 45070,212,127, 0}, { 45072,207,124, 0},
{ 45074,202,121, 0}, { 45076,197,118, 0},
{ 45078,192,115, 0}, { 45080,187,112, 0},
{ 45082,182,109, 0}, { 45084,177,106, 0},
{ 45087,172,103, 0}, { 45090,167,100, 0},
{ 45092,162, 97, 0}, { 45095,157, 94, 0},
{ 45098,152, 91, 0}, { 45102,147, 88, 0},
{ 45105,142, 85, 0}, { 45109,137, 82, 0},
{ 45113,132, 79, 0}, { 45118,127, 76, 0},
{ 45120,249,149, 0}, { 45123,122, 73, 0},
{ 45125,239,143, 0}, { 45128,117, 70, 0},
{ 45131,229,137, 0}, { 45134,112, 67, 0},
{ 45137,219,131, 0}, { 45140,107, 64, 0},
{ 45144,209,125, 0}, { 45147,102, 61, 0},
{ 45151,199,119, 0}, { 45155, 97, 58, 0},
{ 45159,189,113, 0}, { 45163, 92, 55, 0},
{ 45168,179,107, 0}, { 45173, 87, 52, 0},
{ 45178,169,101, 0}, { 45180,251,150, 0},
{ 45183, 82, 49, 0}, { 45187,241,144, 0},
{ 45189,159, 95, 0}, { 45191,236,141, 0},
{ 45195, 77, 46, 0}, { 45200,226,135, 0},
{ 45202,149, 89, 0}, { 45204,221,132, 0},
{ 45209, 72, 43, 0}, { 45214,211,126, 0},

etc etc.............
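The generator program itself was not posted, so the following is only a guess at how such a table could be built. It assumes a 27000 kHz reference, 8-bit m and n fields, and the formula f = (m * 27000 / n) >> p, which matches the rows shown (for example 252 * 27000 / 151 truncates to 45059). Note that m is the multiplier here, as in the table, whereas riva_hw.c uses N as the multiplier. All names are made up, and a frequency window is used to keep the output small.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define REF_KHZ 27000u   /* assumed reference clock */

typedef struct {
    uint32_t f;          /* output frequency in kHz */
    uint8_t  m, n, p;    /* multiplier, divider, post-divider shift */
} pll_entry;

static uint32_t pll_freq_khz(unsigned m, unsigned n, unsigned p)
{
    assert(n != 0 && p < 32);
    return ((uint32_t)m * REF_KHZ / n) >> p;
}

/* Order by frequency; ties are broken by p, then n, so the result is
 * the same regardless of how qsort orders equal keys. */
static int cmp_entry(const void *a, const void *b)
{
    const pll_entry *x = (const pll_entry *)a;
    const pll_entry *y = (const pll_entry *)b;
    if (x->f != y->f) return x->f < y->f ? -1 : 1;
    if (x->p != y->p) return x->p < y->p ? -1 : 1;
    if (x->n != y->n) return x->n < y->n ? -1 : 1;
    return 0;
}

/* Collect every (m, n, p) whose frequency lies in [lo_khz, hi_khz],
 * sort by frequency and keep one entry per distinct frequency.
 * Returns the number of entries kept, or (size_t)-1 if cap is too small. */
static size_t build_pll_table(pll_entry *out, size_t cap,
                              uint32_t lo_khz, uint32_t hi_khz, unsigned max_p)
{
    size_t count = 0, kept = 0, i;
    unsigned m, n, p;

    for (p = 0; p <= max_p; p++)
        for (m = 1; m <= 255; m++)
            for (n = 1; n <= 255; n++) {
                uint32_t f = pll_freq_khz(m, n, p);
                if (f < lo_khz || f > hi_khz) continue;
                if (count == cap) return (size_t)-1;
                out[count].f = f;
                out[count].m = (uint8_t)m;
                out[count].n = (uint8_t)n;
                out[count].p = (uint8_t)p;
                count++;
            }

    qsort(out, count, sizeof out[0], cmp_entry);

    /* Remove repeated frequencies in place, keeping the first of each. */
    for (i = 0; i < count; i++)
        if (kept == 0 || out[i].f != out[kept - 1].f)
            out[kept++] = out[i];
    return kept;
}
```

Which (m, n, p) survives for a given frequency depends on the tie-break in `cmp_entry`, so the kept combinations need not match the ones in the posted table. Real hardware also limits the usable ranges of the dividers and the VCO frequency, which this sketch ignores.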

[edited by - mp3man2000 on April 24, 2004 4:27:26 PM]

[edited by - mp3man2000 on April 25, 2004 3:31:29 AM]
> Anyway I still haven't had a reply from NVIDIA
hehehe.. forget it

quote:PS. I wrote a quick program that iterates through all the possible M, N & P values for the PLL, removes repeated frequencies, then sorts by frequency. The results are returned in an include file. I did this because I was wondering what the granularity could be with regards to PLL frequencies. I will throw in a quick snippet.
Interesting. Were you able to program all of those? I recall the VBE return-nearest-pclk routine having a much higher granularity than those 3..5 kHz. Wonder why; maybe a limitation on the BIOS end? It may have been table-driven instead of trying out all fractions.
