Robb Hill (United States)

asked:

Need to properly manage SQL changes in GitHub for an existing DB that has never had a repo.

I have an existing database that I have inherited. This is SQL Server.


I would like to begin checking DB changes into code, eventually having each DB change committed as code and then executed via my pipeline.

But to get started, I wanted to determine the best structure for an initial commit of the DB as it stands now.

How should I break this up?

I will be using GitHub for check-ins. My pipeline is Jenkins.

Please help with best practices keeping in mind this is already in production.
David Johnson, CD (Canada)

Since Git works at the file level, there really isn't a way of doing this other than taking a backup and then committing the backup.
Robb Hill (Asker)

Well, for example: I can write the scripts myself, and SQL Server will generate scripts. So I am not sure why a SQL backup is the only way.

I would like to get to the point where my devs are literally doing continuous integration against the DB.

For example: we want to modify a table. You would have the whole check-if-exists... ALTER command.
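A minimal T-SQL sketch of that check-if-exists pattern (the table and column names here are hypothetical, just to illustrate the idempotent style):

```sql
-- Hypothetical migration: create dbo.Customer only if it does not already exist,
-- so the script is safe to run repeatedly against any environment.
IF OBJECT_ID(N'dbo.Customer', N'U') IS NULL
BEGIN
    CREATE TABLE dbo.Customer (
        CustomerId INT IDENTITY(1,1) PRIMARY KEY,
        Name       NVARCHAR(100) NOT NULL
    );
END;
```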


But to play catch-up with an existing DB, how would I script it?


And pertaining to best practices, how would I separate these files?


For example...

1) DB schema
2) Stored procs and functions
3) Views and tables
4) Data
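One common way to lay that split out on disk (the folder names below are only a suggestion, not a standard the thread establishes):

```
/db
  /schema        -- database settings, schemas, filegroups
  /tables        -- one CREATE TABLE script per table
  /views         -- one script per view
  /procedures    -- stored procs and functions
  /data          -- static/lookup-table inserts only
  /migrations    -- numbered, idempotent change scripts going forward
```

The baseline lives in the object folders; everything after the initial commit goes into /migrations so the history of changes is explicit.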


At some point you would have a baseline of the DB, and from there everything can be scripted.
(quoting:) "And pertaining to best practices how would I separate these files? For example: 1) db schema, 2) stored procs and functions, 3) views and tables, 4) data"


You can't, as they are part of the database (stored in the .mdf file itself).
Any changes to the database should be decided upon by consensus of all the devs involved, and the database altered by one DB admin only. Then all of the devs would have to get the new base, otherwise things will break.
That does not follow CI/CD for a database.  

Have you never maintained a database build script?
That's getting there. Visual Studio will do this for you.

I will try that and see if it gives me what I want.

I am looking for a more generic standard, versus how Visual Studio does it for you.

I also don't think that covers how to take an existing DB, build a baseline of scripts, and go from there.
To expand a bit on what David Johnson mentioned.

Your question is a bit unclear as to whether you're trying to back up...

1) your database creation script (table schemas)

2)  database contents

You can do either; GitHub is generally used to back up #1, which will be a short script to create all your table schemas.

For backing up your entire database, it's better to make a full or incremental backup file, compress it with xz or zstd, then save the file off to external storage.

3) If I were going to use GitHub to store database contents, I'd likely dump each table into a file and store the file set with a version tag, so any snapshot of the database could be pulled out at once.

And using tar + xz/zstd is far better for making backups.
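As a sketch of the compression step (the backup filename is a stand-in, and this assumes `xz` is installed; a real run would point at an actual backup file):

```shell
# Stand-in file representing a real full or incremental backup.
printf 'dummy backup contents\n' > mydb.bak

# Compress with xz, keeping the original (-k) and forcing overwrite (-f);
# this produces mydb.bak.xz alongside the source file.
xz -kf mydb.bak

ls -l mydb.bak.xz
```

For a directory of per-table dump files, `tar -cJf dumps.tar.xz dumps/` bundles and compresses in one step.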
Let me give a different scenario and see if this makes sense to you both. First, this is not about backing up; we back up SQL quite often.

So I have another project where I do the following.

On my local development machine I am writing a MySQL DB. It could be any DB. In this case I pull a Docker container for development.

In my Dockerfile I call a SQL command.

This SQL command literally creates the schema, any tables, inserts for static data, constraints, etc.

Because Docker only allows one line, I merge these SQL files in the proper order and run them.

The end result is that I have a MySQL database ready to go.
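The merge step can be sketched like this (file names and contents are hypothetical; the numeric prefixes make the glob expand in the intended execution order):

```shell
mkdir -p sql

# Hypothetical pieces, numbered so a lexical sort gives execution order:
# schema first, then constraints, then static data.
printf 'CREATE TABLE lookup (id INT, label VARCHAR(50));\n' > sql/01_schema.sql
printf 'ALTER TABLE lookup ADD PRIMARY KEY (id);\n'          > sql/02_constraints.sql
printf "INSERT INTO lookup VALUES (1, 'active');\n"          > sql/03_static_data.sql

# Concatenate in order into the single file the container init step runs.
cat sql/[0-9]*_*.sql > init.sql
head -n 1 init.sql
```

The same init.sql then becomes the artifact the pipeline ships to prod.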

When I deploy to prod, I simply send this SQL and it builds the DB.

In this case my prod is Aurora on AWS.


Now, because I use Jenkins to build out my pipeline, every check-in to my DB in GitHub can kick off a build and update the SQL scripts.
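A declarative Jenkinsfile for that trigger might look roughly like this (the stage name, migrations path, environment variable, and the sqlcmd loop are all assumptions for illustration, not a tested pipeline):

```groovy
pipeline {
    agent any
    triggers {
        // Fire on pushes to the GitHub repo (assumes the GitHub plugin
        // and a webhook are configured on the Jenkins side).
        githubPush()
    }
    stages {
        stage('Apply DB migrations') {
            steps {
                // Hypothetical: run each migration script, in name order,
                // against the target server via sqlcmd.
                sh 'for f in db/migrations/*.sql; do sqlcmd -S "$DB_SERVER" -d MyDb -i "$f"; done'
            }
        }
    }
}
```

Because the migration scripts are idempotent, re-running the whole folder on every build is safe.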


So now I have a repo of my DB, with every change ever made to it.

If I want to add a column a year from now, I would simply check in the script: something like check-if-exists... and ALTER, etc.
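For the column case specifically, a hedged sketch (table and column names are hypothetical) using COL_LENGTH as the existence check:

```sql
-- Hypothetical migration: add dbo.Customer.Email only if it is not already there.
-- COL_LENGTH returns NULL when the column does not exist.
IF COL_LENGTH(N'dbo.Customer', N'Email') IS NULL
BEGIN
    ALTER TABLE dbo.Customer ADD Email NVARCHAR(255) NULL;
END;
```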


Now, regardless of whether you use GitHub or something else: have you all never seen or done this? I have seen this at quite a few companies.


Hopefully this helps.


Now, in the case of my original question, I have a database that was never version controlled in a repo.

I would like to begin this due diligence, but I am not sure of best practices, and I have never done it on a DB already in production.

These scripts would not capture prod data. The only inserts would be static data, like data in a lookup table, etc.
This is done in a CI/CD pipeline by moving bacpac and dacpac images. Visual Studio Data Tools can help in making a database pipeline.

Unfortunately, with an existing DB you do have to do the extra work to get the scripts built, but once you do, you can take a Git approach to SQL.
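As a sketch of that extraction step with the SqlPackage tool (server and database names are placeholders; this cannot run without a live SQL Server to point at):

```shell
# Extract the schema of the production DB into a dacpac that can be
# committed to the repo as the baseline artifact.
sqlpackage /Action:Extract \
    /SourceServerName:myserver.example.com \
    /SourceDatabaseName:MyDb \
    /TargetFile:MyDb.dacpac
```

From there, SqlPackage's Publish action can diff the dacpac against a target database and apply the changes, which is the model the dacpac-based pipeline follows.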
ASKER CERTIFIED SOLUTION
Robb Hill (United States)
