mp3man2000

NVidia CRTC Registers Mapping

Hi there, if anyone can help or point me in some direction it would be most appreciated. I am writing an application that will synchronise the video output of a graphics card (NVidia) to an incoming program stream (MPEG4). The reason for this is to stop horizontal tearing. I have managed to enumerate the video card on the PCI bus, gather its Memory Start Address information, and I can directly read the display RAM. I believe I have found the start address for the memory mapped CRTC registers etc. My problem is that I now have no idea where and what all the appropriate registers are. I am trying to do this because I believe that if I poll the buffer status register, I should be able to determine when a third party video decoder is trying to do a screen refresh with a new frame of video. By using this in conjunction with the vertical retrace, I could then make adjustments to the timing registers to synchronise the whole thing. The result should be silky smooth video with no dropped or repeated frames. If anyone has had some experience with talking directly to NVidia graphics cards I would be most appreciative of any advice I can get. Thank you in advance.

You sure you want to do that? Nasty business. I had a go at an init-only driver (just setting the video mode), and didn't manage to produce a stable mode. In fact, this is about where I realized writing an OS with drivers alone is hopeless.

Anyway, here are some snippets from code I had as a reference; maybe you can plug them into Google and see what comes up.

[table I compiled of interesting regs for mode set]
var             where (mmio+x)   value        block
ram bandwidth:  0x00100000       0            PFB
mem type:       0x00100200       0x08c10110
cfg1:           0x00100204       0
mem width:      0x00101000       0x803fc447   PEXTDEV
offset0:        0x00400640       0            PGRAPH
offset1:        0x00400644       0
offset2:        0x00400648       0
offset3:        0x0040064C       0
pitch0:         0x00400670       0
pitch1:         0x00400674       0
pitch2:         0x00400678       0
pitch3:         0x0040067C       0
cursor2:        0x00680300       0x01800200   PRAMDAC
pll:            0x00680500       0x0001fb09
pll:            0x00680504       0x00003302
vpll:           0x00680508       0x0002d308
pllsel:         0x0068050C       0x10000700
general:        0x00680600       0x00100100


[load/unload hw state]
VGA_WR08(0x03D4, 0x19);
VGA_WR08(0x03D5, state->repaint0);
VGA_WR08(0x03D4, 0x1A);
VGA_WR08(0x03D5, state->repaint1);
VGA_WR08(0x03D4, 0x25);
VGA_WR08(0x03D5, state->screen);
VGA_WR08(0x03D4, 0x28);
VGA_WR08(0x03D5, state->pixel);
VGA_WR08(0x03D4, 0x2D);
VGA_WR08(0x03D5, state->horiz);
VGA_WR08(0x03D4, 0x1B);
VGA_WR08(0x03D5, state->arbitration0);
VGA_WR08(0x03D4, 0x20);
VGA_WR08(0x03D5, state->arbitration1);
VGA_WR08(0x03D4, 0x30);
VGA_WR08(0x03D5, state->cursor0);
VGA_WR08(0x03D4, 0x31);
VGA_WR08(0x03D5, state->cursor1);
chip->PRAMDAC[0x00000300/4] = state->cursor2;
chip->PRAMDAC[0x00000508/4] = state->vpll;
chip->PRAMDAC[0x0000050C/4] = state->pllsel;
chip->PRAMDAC[0x00000600/4] = state->general;


chip->CURSOR = &(chip->PRAMIN[0x00010000/4 - 0x0800/4]);
chip->CURSORPOS = &(chip->PRAMDAC[0x0300/4]);
chip->VBLANKENABLE = &(chip->PCRTC[0x0140/4]);
chip->VBLANK = &(chip->PCRTC[0x0100/4]);


state->cursor1 = VGA_RD08(0x03D5);
state->cursor2 = chip->PRAMDAC[0x00000300/4];
state->vpll = chip->PRAMDAC[0x00000508/4];
state->pllsel = chip->PRAMDAC[0x0000050C/4];
state->general = chip->PRAMDAC[0x00000600/4];
state->config = chip->PFB[0x00000200/4];
state->offset0 = chip->PGRAPH[0x00000640/4];
state->offset1 = chip->PGRAPH[0x00000644/4];
state->offset2 = chip->PGRAPH[0x00000648/4];
state->offset3 = chip->PGRAPH[0x0000064C/4];
state->pitch0 = chip->PGRAPH[0x00000670/4];
state->pitch1 = chip->PGRAPH[0x00000674/4];
state->pitch2 = chip->PGRAPH[0x00000678/4];



[page flipping routine]

static void SetStartAddress(RIVA_HW_INST *chip, unsigned start)
{
    int offset = start >> 2;        /* start address in 4-byte units */
    int pan = (start & 3) << 1;     /* leftover bytes go to the pan register */
    unsigned char tmp;

    /*
     * Unlock extended registers.
     */
    chip->LockUnlock(chip, 0);
    /*
     * Set start address.
     */
    VGA_WR08(0x3D4, 0x0D); VGA_WR08(0x3D5, offset);      /* bits 0-7 */
    offset >>= 8;
    VGA_WR08(0x3D4, 0x0C); VGA_WR08(0x3D5, offset);      /* bits 8-15 */
    offset >>= 8;
    VGA_WR08(0x3D4, 0x19); tmp = VGA_RD08(0x3D5);
    VGA_WR08(0x3D5, (offset & 0x01F) | (tmp & ~0x1F));   /* bits 16-20 */
    VGA_WR08(0x3D4, 0x2D); tmp = VGA_RD08(0x3D5);
    VGA_WR08(0x3D5, (offset & 0x60) | (tmp & ~0x60));    /* bits 21-22 */
    /*
     * 4 pixel pan register.
     */
    offset = VGA_RD08(chip->IO + 0x0A);   /* dummy read before the two 0x3C0 writes */
    VGA_WR08(0x3C0, 0x13);
    VGA_WR08(0x3C0, pan);
}

Thank you very much for your reply. The information you have supplied is nearly exactly what I am after. Do you have any more info? Where did you get this from?
Did you use these values as offsets from the Start Memory Address values returned by the card?

Is there any more info for each of the registers you have listed?
I am now about to lock up my PC.

Yes, those addresses are offsets in the card's memory-mapped register file (base address is the smaller memory range indicated in PCI config space). The code accesses them as 0x??/4 because the array elements are declared as 32 bits.
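
To make the /4 indexing concrete, here is a minimal sketch in C. The helper names (block_base, reg_read32, reg_write32) are made up for illustration and are not from riva_hw.c; the only assumption carried over from the thread is that the register file is viewed through a pointer to 32-bit elements, so a byte offset has to be divided by 4 before it is used as an index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pointer to a register block that starts
 * `block_byte_off` bytes into the MMIO range (e.g. 0x00680000 for PRAMDAC). */
static volatile uint32_t *block_base(volatile uint32_t *mmio, uint32_t block_byte_off)
{
    assert((block_byte_off & 3u) == 0);   /* blocks are dword aligned */
    return mmio + block_byte_off / 4;     /* pointer arithmetic counts 4-byte elements */
}

/* Hypothetical helper: read the 32-bit register at a byte offset within a block. */
static uint32_t reg_read32(const volatile uint32_t *block, uint32_t byte_off)
{
    assert((byte_off & 3u) == 0);
    return block[byte_off / 4];
}

/* Hypothetical helper: write the 32-bit register at a byte offset within a block. */
static void reg_write32(volatile uint32_t *block, uint32_t byte_off, uint32_t value)
{
    assert((byte_off & 3u) == 0);
    block[byte_off / 4] = value;
}
```

With these, reg_read32(block_base(mmio, 0x00680000), 0x508) touches the same dword as chip->PRAMDAC[0x00000508/4] in the snippets above, i.e. byte offset 0x00680508 from the MMIO base.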

It's been years, and my references have evaporated (one HD loss too many? :/). Anyway, I think the Nvidia-specific code was from riva_hw.c (should be easy to find). Also try XFree86.
I was surprised how much runs over the old port I/O interface. Ralf Brown's port list and the VGA Guide helped there.

I would love to see register docs as well (at least the mode set stuff; don't have to expose 3d FIFOs), but all NVidia released was this one piece of code for Riva TNT and GeForce 4-5 years back or so. Oh well, better than nothing.

> I am now about to lock up my PC.
hehe, ain't that the truth. I crashed my machine so often, it wasn't even funny.

Well, after a bit more searching I found the source that you mentioned, thank you (nv4ref.h is one of the useful ones). I had a look at some of the address ranges that were obvious, added them to the start address returned from the card, and bingo, it was there (I used the start gwhcursor). That is a good one because it shows you where the Hardware Cursor (mouse) is.

I actually was hoping that the CRTC settings were mapped there somewhere as well. It doesn't look like they are, but just using the old out and in gets those values alright.

I have been looking at the IO CRTC registers while adjusting values in PowerStrip and I can see them changing - cool.

Now the memory mapped ones: I am a little confused about the PLL settings. For instance, I watch the VPLL M, N and P values when I change the pixel clock in PowerStrip; I can see them changing, but they don't make a lot of sense at the moment. I have programmed PLLs before but I am not sure whether these are divider values or not, e.g. at 102.000 MHz they are 0x09 0x44 0x01 and at 108.000 MHz they are 0x02 0x10 0x01. The fourth byte doesn't seem to be used; it is just 0x00 all the time, and in the riva file it seems that that's how it is.

Have you had any experience with the PLL values? Maybe the bytes or nibbles are reverse ordered or inverted?

[edited by - mp3man2000 on April 22, 2004 7:40:54 PM]

> That is a good one because it shows you where the Hardware Cursor (mouse) is.
Indeed useful. More so if its position is set in the ISR - no DPC needed, cursor shows even if OS is hosed.

> I actually was hoping that the CRTC settings were mapped there somewhere as well, however it doesn't look like they are, but just using the old out and in gets those values alright.
Yep.

quote:
Now the memory mapped ones, I am a little confused about the PLL settings, for instance I watch the VPLL M N and P values when I change the pixel clock in PowerStrip, which I can see changing but they don't make a lot of sense at the moment. I have programmed PLLs before but I am not sure whether these are divider values or not, e.g. at 102.000 MHz they are 0x09 0x44 0x01 and at 108.000 MHz they are 0x02 0x10 0x01. The fourth byte doesn't seem to be used, it is just 0x00 all the time, and in the riva file it seems that that's how it is.

My experience with programming the PLL was a video mode that sort of 'swam' on the monitor. But looking at riva_hw.c, we can figure it out:
chip->CrystalFreqKHz = (chip->PEXTDEV[0x00000000/4] & 0x00000020) ? 14318 : 13500;
I think 13.5 MHz is what we have.
MClk = (N * chip->CrystalFreqKHz / M) >> P;
For M = 9, N = 68, we get 102 / 2; M = 2, N = 16 yields 108 / 2. P messes things up: it's shifted down 1 bit. Maybe the card wants it that way?
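
The arithmetic above can be reproduced with a small sketch. The function and type names here are hypothetical. The register layout (M in the low byte, then N, then P, top byte unused) is an assumption based on the byte order reported from PowerStrip, not something confirmed by documentation. The formula itself is the (N * crystal / M) >> P line quoted from riva_hw.c.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned m;   /* divider */
    unsigned n;   /* multiplier */
    unsigned p;   /* post-divider, applied as a right shift */
} pll_coeffs;

/* Assumed layout: M = bits 0-7, N = bits 8-15, P = bits 16-23. */
static pll_coeffs pll_unpack(uint32_t reg)
{
    pll_coeffs c;
    c.m = reg & 0xFFu;
    c.n = (reg >> 8) & 0xFFu;
    c.p = (reg >> 16) & 0xFFu;
    return c;
}

/* Output frequency in kHz: (N * crystal / M) >> P. */
static unsigned pll_khz(unsigned crystal_khz, pll_coeffs c)
{
    assert(c.m != 0);    /* avoid division by zero */
    assert(c.p < 32);    /* keep the shift well defined */
    return (c.n * crystal_khz / c.m) >> c.p;
}
```

With a 13500 kHz crystal, the two readings (0x09 0x44 0x01 and 0x02 0x10 0x01) come out at 51000 kHz and 54000 kHz, which is exactly half of the 102 MHz and 108 MHz pixel clocks PowerStrip was set to.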

Interesting stuff..

OK, I was confused when you first put up the math for calculating the pixel clock. That last shift was confusing.

I looked up the offending code and worked out what you meant. After thinking about it for a while and doing some more calculations, it hit me that the clock source must be 27 MHz; this makes that formula work OK. It actually makes sense because the card I am talking to is a GeForce4 Ti, which probably means that they have standardised the clock frequency to what it probably should be. 27 MHz clocks have been used in professional video equipment for years, because the correct horizontal frequencies can be derived from them.
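
As a quick check of the 27 MHz reading, assuming the same (N * crystal / M) >> P formula from riva_hw.c quoted earlier and the M, N, P bytes read back while using PowerStrip (0x09 0x44 0x01 and 0x02 0x10 0x01):

```latex
f_{\text{out}} = \frac{N \cdot f_{\text{ref}}}{M \cdot 2^{P}}, \qquad
\frac{68 \cdot 27\,\text{MHz}}{9 \cdot 2^{1}} = 102\,\text{MHz}, \qquad
\frac{16 \cdot 27\,\text{MHz}}{2 \cdot 2^{1}} = 108\,\text{MHz}
```

Both results match the pixel clocks that were set, so a 27 MHz reference is at least consistent with these two data points.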

I am still a bit confused by where you read the clock frequency from (what is the offset?)....

Also, the vertical retrace bit doesn't seem to be implemented any more. I have tried enabling and disabling it, and indeed when I do OpenGL buffer swaps with it enabled it does limit the framerate to the refresh rate, but I need to be able to poll something so I know when the retraces happen. Unfortunately, starting an OpenGL instance and waiting for the buffer swap function to return is not an elegant way of doing it, especially when I won't actually be drawing any video.

Maybe I have to read the current drawing position in the video buffer (is that even possible?)..

Thanks again for your time in replying to these posts, it is much more fun bouncing ideas off someone else, and by the way I haven't locked the PC up again yet.

If you were wondering, I am actually doing all this under WinXP, which has made reading and writing to physical memory and IO ports interesting. So far I have found the best way has been to use the TVicHW32 library; it has been fantastic and Windows hasn't complained once....

[edited by - mp3man2000 on April 22, 2004 11:33:09 PM]

Yeah. Interesting about the 27 MHz; I don't know either why everything is divided by 2.

PEXTDEV region offset: googling PEXTDEV yields "nv->pextdev = mmio+0x00101000/4"; mmio is a ulong*, so 0x00101000 is the byte offset.
BTW, the nvidia.c that came up is very interesting - will have a closer look later.
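
Putting the offset together with the crystal test from riva_hw.c quoted earlier gives something like the sketch below. The macro and function names are invented for illustration. The only facts taken from the thread are the 0x00101000 byte offset, the 0x20 bit mask, and the two crystal values.

```c
#include <assert.h>
#include <stdint.h>

#define NV_PEXTDEV_OFFSET 0x00101000u  /* hypothetical name; byte offset from the MMIO base */
#define NV_STRAP_CRYSTAL  0x00000020u  /* hypothetical name; bit tested in riva_hw.c */

/* Decode the crystal frequency (kHz) from the first PEXTDEV register value. */
static unsigned crystal_khz_from_strap(uint32_t pextdev0)
{
    return (pextdev0 & NV_STRAP_CRYSTAL) ? 14318u : 13500u;
}

/* Read PEXTDEV[0] through a 32-bit-wide MMIO pointer and decode it. */
static unsigned read_crystal_khz(const volatile uint32_t *mmio)
{
    assert(mmio != 0);
    return crystal_khz_from_strap(mmio[NV_PEXTDEV_OFFSET / 4]);
}
```

The value 0x803fc447 listed for offset 0x00101000 in the register table earlier in the thread has bit 5 clear, so this test would report 13500 kHz for that card.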

quote:
Also the vertical retrace bit doesn't seem to be implemented any more, I have tried enabling and disabling it, and indeed when I do OpenGL buffer swaps with it enabled it does limit the framerate to the refresh rate, but I need to be able to poll something so I know when they are. Unfortunately starting an OpenGL instance and waiting for the buffer swap function to return is not an elegant way of doing it especially when I won't actually be drawing any video.

You mean the bit in reg 0x3da (IIRC) isn't toggled anymore on refresh? That's bad.
Another idea: do you have access to the VESA VBE functions? ah=0x4f, int 0x10. Those are a standardized way of talking to SVGA cards, a reaction to the mess of card-specific code. In particular, there was a triple-buffering function that let you schedule a display flip after the next vsync. Maybe you could use that?
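
For reference, this is roughly how the classic retrace poll on 0x3DA looks. It is a sketch under assumptions: port_read8_fn is a hypothetical hook standing in for whatever port-read call is available (an inb wrapper, a driver library call, and so on), and sim_read8 is only a stand-in so the loop can be exercised without hardware. On standard VGA, bit 3 of Input Status #1 is set during vertical retrace; whether a given NVidia card still toggles it is exactly the open question here.

```c
#include <assert.h>
#include <stdint.h>

#define VGA_INPUT_STATUS_1 0x3DAu  /* colour-mode address of Input Status #1 */
#define VGA_IS1_VRETRACE   0x08u   /* bit 3: set while in vertical retrace */

typedef uint8_t (*port_read8_fn)(uint16_t port);  /* hypothetical port-read hook */

/* Wait for the start of the next vertical retrace.
 * Returns 0 on success, -1 if the poll budget ran out first. */
static int wait_vretrace_start(port_read8_fn rd, unsigned long max_polls)
{
    assert(rd != 0);
    /* If a retrace is already in progress, let it finish first. */
    while (rd(VGA_INPUT_STATUS_1) & VGA_IS1_VRETRACE)
        if (max_polls-- == 0)
            return -1;
    /* Now wait for the bit to go high again. */
    while (!(rd(VGA_INPUT_STATUS_1) & VGA_IS1_VRETRACE))
        if (max_polls-- == 0)
            return -1;
    return 0;
}

/* Simulated port for off-hardware testing: the retrace bit is set
 * for 2 out of every 8 reads. */
static unsigned sim_tick;
static uint8_t sim_read8(uint16_t port)
{
    (void)port;
    return (sim_tick++ % 8 >= 6) ? VGA_IS1_VRETRACE : 0;
}
```

The poll budget is there because a bit that never toggles would otherwise hang the caller, which matters when the bit may not be implemented at all.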


quote:
Maybe I have to read the current drawing position in the video buffer (is that even possible?)..
hmm, interesting idea. Always staying behind that would eliminate flickering and tearing, but I have no idea how to get at the counter / if it's even possible.

quote:
Thanks again for your time in replying to these posts, it is much more fun bouncing ideas off someone else, and by the way I haven't locked the PC up again yet.

Glad to. hehe

quote:
If you were wondering, I am actually doing all this under WinXP, which has made reading and writing to physical memory and IO ports interesting. So far I have found the best way has been to use the TVicHW32 library, it has been fantastic and Windows hasn't complained once....

Was about to ask. Costs though. Writing a simple WDM driver that does port I/O on behalf of the app is on my todo.

I haven't actually made much progress since my last post, well, not on what I am supposed to. I have however had fun learning about OpenGL. I am playing with surfaces and lighting at the moment; I will have a bit of a play with textures next, I think.

Anyway, I still haven't had a reply from NVIDIA. I suspect that there is a lot of information that can be extracted from the new cards, considering that the info we are playing with is for the old RIVA card. Anyway, I will have a bit more of a look at the OpenGL tutes....

PS. I wrote a quick program that iterates through all the possible M, N and P values for the PLL. It removes repeated frequencies, then sorts by frequency order. The results are returned in an include file. I did this because I was wondering what the granularity could be with regards to PLL frequencies. I will throw in a quick snippet. This is only the start but you will get the general idea.


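/* The guard macro must not share the table's name: an empty
   #define DLS_NVPLL_tbl would erase the array identifier below. */
#ifndef DLS_NVPLL_TBL_H
#define DLS_NVPLL_TBL_H


typedef struct
{
    long int f;   /* output frequency, kHz */
    int m;
    int n;
    int p;
} _DLS_NVPLL_tbl;

_DLS_NVPLL_tbl DLS_NVPLL_tbl[]=
{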
{ 45059,252,151, 0}, { 45060,247,148, 0},
{ 45062,242,145, 0}, { 45063,237,142, 0},
{ 45064,232,139, 0}, { 45066,227,136, 0},
{ 45067,222,133, 0}, { 45069,217,130, 0},
{ 45070,212,127, 0}, { 45072,207,124, 0},
{ 45074,202,121, 0}, { 45076,197,118, 0},
{ 45078,192,115, 0}, { 45080,187,112, 0},
{ 45082,182,109, 0}, { 45084,177,106, 0},
{ 45087,172,103, 0}, { 45090,167,100, 0},
{ 45092,162, 97, 0}, { 45095,157, 94, 0},
{ 45098,152, 91, 0}, { 45102,147, 88, 0},
{ 45105,142, 85, 0}, { 45109,137, 82, 0},
{ 45113,132, 79, 0}, { 45118,127, 76, 0},
{ 45120,249,149, 0}, { 45123,122, 73, 0},
{ 45125,239,143, 0}, { 45128,117, 70, 0},
{ 45131,229,137, 0}, { 45134,112, 67, 0},
{ 45137,219,131, 0}, { 45140,107, 64, 0},
{ 45144,209,125, 0}, { 45147,102, 61, 0},
{ 45151,199,119, 0}, { 45155, 97, 58, 0},
{ 45159,189,113, 0}, { 45163, 92, 55, 0},
{ 45168,179,107, 0}, { 45173, 87, 52, 0},
{ 45178,169,101, 0}, { 45180,251,150, 0},
{ 45183, 82, 49, 0}, { 45187,241,144, 0},
{ 45189,159, 95, 0}, { 45191,236,141, 0},
{ 45195, 77, 46, 0}, { 45200,226,135, 0},
{ 45202,149, 89, 0}, { 45204,221,132, 0},
{ 45209, 72, 43, 0}, { 45214,211,126, 0},

etc etc.............
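
A generator along the lines described might look like the following sketch. It is not the poster's program: the function name, the coefficient ranges, and the use of the riva_hw.c convention (N multiplies, M divides, P shifts) are all assumptions, and for brevity it keeps only the frequencies rather than the M/N/P triple that produced each one.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for unsigned long, ascending. */
static int cmp_ulong(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* Enumerate (n * ref_khz / m) >> p for m in 1..m_max, n in 1..n_max,
 * p in 0..p_max, then sort and drop duplicates in place.
 * Returns the number of unique frequencies stored in out[].
 * Enumeration stops early if `cap` entries have been collected. */
static size_t pll_enumerate(unsigned long ref_khz,
                            unsigned m_max, unsigned n_max, unsigned p_max,
                            unsigned long *out, size_t cap)
{
    size_t count = 0, unique = 0, i;
    unsigned m, n, p;

    assert(out != 0 && p_max < 32);
    for (m = 1; m <= m_max; ++m)
        for (n = 1; n <= n_max; ++n)
            for (p = 0; p <= p_max; ++p)
                if (count < cap)
                    out[count++] = (n * ref_khz / m) >> p;

    qsort(out, count, sizeof out[0], cmp_ulong);

    for (i = 0; i < count; ++i)              /* in-place dedupe of sorted data */
        if (unique == 0 || out[i] != out[unique - 1])
            out[unique++] = out[i];
    return unique;
}
```

One observation on the table above, offered tentatively: its entries appear to satisfy f = m * 27000 / n with p = 0 (for example 252 * 27000 / 151 is about 45059), so the column labelled m seems to play the multiplier role that riva_hw.c calls N.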

[edited by - mp3man2000 on April 24, 2004 4:27:26 PM]

[edited by - mp3man2000 on April 25, 2004 3:31:29 AM]

> Anyway I still haven't had a reply from NVIDIA
hehehe.. forget it

quote:
PS. I wrote a quick program that iterates through all the possible M N & P values for the PLL, it removes repeated frequencies, then sorts by frequency order. The results are returned in an include file. I did this because I was wondering what the granularity could be with regards to PLL frequencies, I will throw in a quick snippet.
Interesting. Were you able to program all of those? I recall the VBE return-nearest-pclk routine having a much higher granularity than those 3..5 kHz. Wonder why; maybe a limitation on the BIOS end? It may have been table-driven instead of trying out all fractions.

I haven't actually tried to send values to the PLL yet. I will give it a go and see what happens (PC with no video, probably). Luckily, because I use an LCD monitor, even the slightest change in frequency can be noticed. However, that can be detrimental as well: if you end up changing the frequency substantially, the picture can go away completely.

I will let you know what happens (maybe I won't be able to see to type).

Oh yes, there was one other thing I wished to ask. Do you know where the memory positions are for the second head? That is what I actually have to control in the end, not the primary. Although I can probably use the primary buffer status to determine frame decodes, I need to get the vertical interval from the TV output.

[edited by - mp3man2000 on April 25, 2004 8:55:51 PM]

Well, it is all starting to seem too hard. I have managed to send the PLL values and it changes frequency, but I found that I have to check each byte after it has been sent and resend it if it reads back differently (which happens a lot). This seems very ugly to me and suggests that I am not the only one sending values. Anyway, I have locked up the PC a bit, but I am persisting.
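
The check-and-resend workaround described above can be sketched as a bounded retry loop. Everything here is illustrative: reg_io is a hypothetical pair of access hooks, and flaky_reg simulates a register that drops some writes so the loop can be tried without hardware. If another agent (the vendor driver, for instance) really is rewriting the register, a loop like this only narrows the window and does not fix the race.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t (*read)(void *ctx);
    void     (*write)(void *ctx, uint32_t value);
    void      *ctx;
} reg_io;   /* hypothetical access hooks for one register */

/* Write `value`, read it back, and retry up to max_tries times.
 * Returns the number of writes it took, or -1 if it never stuck. */
static int reg_write_verified(const reg_io *io, uint32_t value, int max_tries)
{
    int attempt;
    assert(io && io->read && io->write && max_tries > 0);
    for (attempt = 1; attempt <= max_tries; ++attempt) {
        io->write(io->ctx, value);
        if (io->read(io->ctx) == value)
            return attempt;
    }
    return -1;
}

/* Simulated register that silently ignores its first `drops` writes. */
typedef struct { uint32_t value; int drops; } flaky_reg;

static uint32_t flaky_read(void *ctx)
{
    return ((flaky_reg *)ctx)->value;
}

static void flaky_write(void *ctx, uint32_t value)
{
    flaky_reg *r = (flaky_reg *)ctx;
    if (r->drops > 0)
        r->drops--;          /* write lost */
    else
        r->value = value;    /* write lands */
}
```

Returning the attempt count makes it easy to log how often the resend path is actually taken.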

Aye. Probably much more work than it's worth.

hmm, after resending, is the PLL value correct? Is it a problem with granularity? If not, and something else is writing to it at the same time, not good at all.

I only can give a little help with nVidia dualhead behaviour.

We will use 0x3d4/0x3d5 out writes.
To get access to CRTC2 registers you should:
1. Unlock CRTC registers writing 0x57 to 0x1f
2. Move CRTC registers ownership to second head writing 0x03 to 0x44
3. Do some stuff
4. Lock CRTC registers writing 0x99 to 0x1f

To return ownership to first head you should write 0x00 to 0x44.
(with unlocking - locking again).
It works for me.
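
The sequence above, written out as index/data port writes, might look like this sketch. The register indices and values (0x1F, 0x57, 0x99, 0x44, 0x03, 0x00) are the ones given in the steps. The function names, the port_write8_fn hook, and the log_write8 recorder are hypothetical and exist only so the ordering can be checked without touching hardware.

```c
#include <assert.h>
#include <stdint.h>

#define CRTC_INDEX 0x3D4u
#define CRTC_DATA  0x3D5u

typedef void (*port_write8_fn)(uint16_t port, uint8_t value);  /* hypothetical hook */
typedef void (*crtc_body_fn)(port_write8_fn wr);               /* "do some stuff" */

/* Write one CRTC register: index to 0x3D4, then data to 0x3D5. */
static void crtc_write(port_write8_fn wr, uint8_t index, uint8_t value)
{
    wr(CRTC_INDEX, index);
    wr(CRTC_DATA, value);
}

/* Unlock, hand CRTC ownership to the chosen head, run the body, lock again. */
static void crtc_with_head(port_write8_fn wr, int second_head, crtc_body_fn body)
{
    assert(wr != 0);
    crtc_write(wr, 0x1F, 0x57);                       /* 1. unlock */
    crtc_write(wr, 0x44, second_head ? 0x03 : 0x00);  /* 2. select owner */
    if (body)
        body(wr);                                     /* 3. access that head's registers */
    crtc_write(wr, 0x1F, 0x99);                       /* 4. lock */
}

/* Recorder used in place of real port output. */
static uint16_t log_port[16];
static uint8_t  log_val[16];
static unsigned log_len;

static void log_write8(uint16_t port, uint8_t value)
{
    if (log_len < 16) {
        log_port[log_len] = port;
        log_val[log_len] = value;
        log_len++;
    }
}
```

Calling crtc_with_head(wr, 0, 0) afterwards corresponds to returning ownership to the first head, with the same unlock and lock around it.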

Please look more closely at
http://www.textsure.net/~ela/download/rivatv/rivatv/src/v4l-riva.c
to find details.

WBR, Alexey

[edited by - aktinkon on May 18, 2004 3:35:26 AM]

Thank you for the dual head information. I will try switching as you suggest. The project has been halted at the moment because the company that supplies the codec seems to have solved the problem by using Windows Media Player for displaying the video. However, I am still interested in learning about graphics card programming anyway.

I have also been sidetracked learning OpenGL, which I have found most rewarding, and I am trying to get the SMPEG library to compile on Win32 using the Borland compiler. I have managed to compile the plaympeg test program using the DLL that someone else has built; however, I would like to compile a library with MPEG audio playback only, as I don’t need video playback.

Anyway, this little project has now sidetracked me into writing a game, which is fun. In fact I am rewriting the classic game Lunar Lander; well, maybe it isn’t classic, but when it came out I loved it.

If anyone is interested I could post what I have done so far and you could have a look.
Thanks everyone for your input.

I will post it anyway, I am curious for some responses.

OK, after having a look and realising that I cannot upload to this server, I have created a quick and dirty web page so that you can download the game. It isn't big, so it is an easy download. Anyway, have a look and tell me what you think.

Here is the game

[edited by - mp3man2000 on May 22, 2004 9:08:55 PM]
