moeen k

How should I work with non-text files in Git?


Hi.

I want to know what happens when I update a SQLite file or a Unity scene in my project and then commit it. Does Git update the file in that commit? If not, how should I handle these types of files?


Git treats all files equally by default, regardless of their size, but its processing model isn't very good for massive files: operations need RAM proportional to the size of the largest file being operated on (committed, checked out, diffed, etc.) at any given time.

Among other things, you want to look into Git LFS ("Large File Storage"), which lets Git track just the metadata for large files and defers storing their contents to a second, parallel store that is optimized for them. You need to opt files in to LFS using patterns just like the ones in .gitignore (they are recorded in .gitattributes), but otherwise it's quite easy.
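
To make the opt-in step concrete, here is a minimal sketch, assuming Git LFS is installed. Running `git lfs track "*.psd"` records a line in .gitattributes; the helper below (`lfs_attribute_lines` is a made-up name for illustration) only builds lines in that format so you can see what ends up in the file. The patterns are examples, not a recommendation.

```python
def lfs_attribute_lines(patterns):
    """Return .gitattributes lines that route matching files through Git LFS."""
    # This is the attribute set that `git lfs track "<pattern>"` records.
    attrs = "filter=lfs diff=lfs merge=lfs -text"
    return [f"{pattern} {attrs}" for pattern in patterns]


# Example patterns only; pick the ones that match your own large assets.
for line in lfs_attribute_lines(["*.psd", "*.fbx", "*.sqlite"]):
    print(line)
```

In practice you would normally let `git lfs track` write these lines for you and then commit .gitattributes along with the files.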


Your post doesn't say how much you already know about it, or if you understand what the problem is with the files.

Git records a snapshot of every version of every file, and keeps the history compact by compressing each version and storing similar versions as differences (deltas) against one another. Git was designed around text files, and it works really well for source code. Source code files are small, usually a few kilobytes, the differences between versions are small, and the text compresses efficiently.

But that isn't the case with all files. Binary files tend not to produce useful differences, nor do they tend to compress well. In practice Git often ends up storing close to a full copy of every version of a binary file, rather than only a small difference. This means the files take more space in the repository. Binary files used by games are also generally much larger than text, and changes to them result in large histories. A project goes through many updates and revisions over time, and each one takes space, so binary file history quickly takes enormous space.
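
A rough way to see the compression side of this for yourself: Git compresses objects with zlib, so the sketch below compares how well zlib does on repetitive, source-like text and on random bytes. The random bytes are only a stand-in (my assumption) for already-compressed assets such as textures or audio, and `compression_ratio` is a name made up for this example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive, source-like text versus random bytes standing in for
# already-compressed binary assets (textures, audio, archives).
text_like = b"for (int i = 0; i < count; ++i) { total += values[i]; }\n" * 2000
binary_like = os.urandom(len(text_like))

print(f"text-like:   {compression_ratio(text_like):.3f}")
print(f"binary-like: {compression_ratio(binary_like):.3f}")
```

The text-like input shrinks to a small fraction of its size, while the random input stays at roughly its original size. Real source files are less repetitive than this sample, so treat the numbers as an illustration of the trend only.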

Since everyone keeps a copy of the complete version history on their machine, large Git repositories are difficult to work with across teams. Syncing a 3 megabyte repository of source code isn't difficult, but syncing a 200 megabyte repository takes time and is painful even on fast connections. Repositories measured in gigabytes take ages to sync and to process. That's part of the reason people created Large File Storage (LFS) for Git, so that large files aren't stored in the distributed repository. That comes with benefits for time and space, but drawbacks when it comes to synchronization and the distributed architecture.

Git works well and I've used it on several projects both inside and outside the industry, but it is a bad fit for most games. Git+LFS is a combination that lets games use the better parts of Git, but the tradeoffs it makes in the distributed architecture are a business risk. Most of the games industry uses Perforce, which does a much better job of dealing with large files and large repositories.

Unity has an option to save scene files in a text format rather than a binary format. This makes the files larger, but it lets text-based difference tools like Git see small edits as very small changes, so it is more efficient in the long run. If you're using Unity and Git together, learn more about text-based scenes and consider using the format.
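
In the Unity editor this is the Asset Serialization mode ("Force Text") under the Editor project settings. If you want to check which scenes in an existing project are still binary, here is a small sketch. It relies on text-serialized Unity assets starting with a `%YAML` directive; the function names and the project path are hypothetical.

```python
from pathlib import Path

YAML_HEADER = b"%YAML"  # text-serialized Unity assets begin with a YAML directive


def is_text_serialized(first_bytes: bytes) -> bool:
    """Return True if the leading bytes look like a text-serialized asset."""
    return first_bytes.startswith(YAML_HEADER)


def binary_scenes(project_root: str):
    """List .unity files under project_root that do not start with the YAML header."""
    result = []
    for scene in Path(project_root).rglob("*.unity"):
        with open(scene, "rb") as handle:
            if not is_text_serialized(handle.read(len(YAML_HEADER))):
                result.append(scene)
    return sorted(result)
```

Calling `binary_scenes("path/to/YourProject/Assets")` would then list the scenes worth re-saving after switching the setting.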

A SQLite database file will not have easy-to-use differences, but if the files are small and change infrequently they shouldn't cause a problem in a repository. If they are large or if they change frequently, the repository will become large and difficult to use, with the same problems as any other binary resource.

Exactly how big the problem is depends on your game. From my past experience, the pain starts around 100-200 megabytes in the repository.  If your repository will be smaller than that even with your few binary files, you're probably fine.
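
If you want to see where your own project stands, here is a minimal sketch that adds up the size of the files under a directory such as .git. The 100 megabyte default is just the lower end of the range mentioned above, not a hard limit, and both function names are made up for this example.

```python
import os


def directory_size_mb(path: str) -> float:
    """Total size of regular files under path, in megabytes (10**6 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks so nothing is counted twice
                total += os.path.getsize(full)
    return total / 1_000_000


def over_threshold(size_mb: float, threshold_mb: float = 100.0) -> bool:
    """Return True once the size reaches the chosen pain threshold."""
    return size_mb >= threshold_mb
```

For example, `over_threshold(directory_size_mb(".git"))` run from the root of a working copy tells you whether the history has reached that range. Git's own `git count-objects -vH` reports similar information.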

 


Everything said above is true, but it's probably worth mentioning an approach I used in a recent professional project.

I wanted the database structure and certain tables to be stored in Git with the source code, so I put in some pre-commit hooks that mysqldump those tables and their data into the source tree and ‘git add’ the result.
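
Since the original question was about SQLite rather than MySQL, here is a rough equivalent of the dump step as a sketch. It uses `Connection.iterdump()` from Python's standard sqlite3 module in place of mysqldump; the function name and file names are hypothetical, and this is not the hook from my project.

```python
import sqlite3


def dump_sqlite_to_sql(db_path: str, sql_path: str) -> None:
    """Write the schema and data of a SQLite database as a plain-text SQL script."""
    connection = sqlite3.connect(db_path)
    try:
        # Fixed "\n" line endings keep the dump identical across platforms.
        with open(sql_path, "w", encoding="utf-8", newline="\n") as out:
            for statement in connection.iterdump():
                out.write(statement + "\n")
    finally:
        connection.close()
```

One way to wire it up (an assumption on my part): an executable script at .git/hooks/pre-commit calls `dump_sqlite_to_sql("game.sqlite", "game.sql")` and then runs `git add game.sql`, so the text dump travels with every commit.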

It worked very well, hope this helps someone else too!


Regarding the SQLite database files, you could keep SQL scripts in source control to recreate them (presumably dumped with a tool, as already suggested), or a combination of well-behaved text source files: lean and stable DDL-only SQL scripts, aggressively sorted and reformatted CSV or JSON files for the data, and some code to create the binary SQLite databases and populate them.
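
The "some code" part can be quite small. A minimal sketch, assuming one DDL script and one CSV file per table whose first row holds the column names; `build_database` and all file names are hypothetical, and the table and column names are trusted because they come from your own repository.

```python
import csv
import sqlite3


def build_database(db_path, schema_sql_path, table, csv_path):
    """Create a SQLite database from a DDL script and load one table from a CSV file."""
    connection = sqlite3.connect(db_path)
    try:
        with open(schema_sql_path, encoding="utf-8") as schema:
            connection.executescript(schema.read())
        with open(csv_path, newline="", encoding="utf-8") as data:
            reader = csv.reader(data)
            columns = next(reader)  # first row holds the column names
            placeholders = ", ".join("?" for _ in columns)
            column_list = ", ".join(f'"{name}"' for name in columns)
            # Identifiers are interpolated, values are bound as parameters.
            connection.executemany(
                f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})',
                reader,
            )
        connection.commit()
    finally:
        connection.close()
```

A build step would call something like `build_database("game.sqlite", "schema.sql", "items", "items.csv")`, and the generated .sqlite file itself would be listed in .gitignore.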

Images could be similarly "compiled" from text formats like PAM.

