Best way to deploy host-based IPS signatures without causing disruption

I'm less concerned about network-based IPS: an entire subnet may be
covered by just a pair of devices, & a signature that disrupts service
can be rolled back rapidly.

However, I have host-based IPS signatures (Trend Micro's, numbering in
the thousands) to be deployed into each of the Production VMs, & the
principal can't guarantee they won't break the apps or disrupt services.

The signatures carry different severity ratings (Critical with Exploits,
Critical with Smart, High, Med, Low). We're also dividing them up by OS,
but it's still a lot.

a) do people generally take snapshot backups first?  I must say I don't
    have much storage for snapshots, & snapshots have caused the
    datastore to fill up in the past, causing painful disruptions

b) is there a quick way to fall back or roll back if we are affected?

c) is it common practice to deploy more signatures on Web servers
    (ie those facing external or Untrusted zones) & fewer on, say,
    the backend (eg: DB servers)?

The signatures & the agent sit inside each individual VM, unlike
network-based IPS, where network traffic is re-routed to the network
IPS before being passed on to the servers.

I'll be checking with Trend as well.

David Johnson, CD, MVP (Owner) commented:
> a) do people generally take snapshot backups first?  I must say I don't
>     have much storage for snapshots, & snapshots have caused the
>     datastore to fill up in the past, causing painful disruptions
> b) is there a quick way to fall back or roll back if we are affected?

Your first pressing problem is storage: increase your storage pool. I would guess that backups are not much of a priority because of (a), the lack of storage space. I don't think that separating out IPS signatures is a good idea. Getting hit by a low-priority intrusion is just as bad as being hit by a high-severity zero-day intrusion.

Does your backup policy follow the 3-2-1 rule: 3 copies, 2 different types of media, 1 offsite? Do you keep a week's worth of dailies, a month's worth of weeklies, a year's worth of monthlies, and a collection of yearlies? If asked to restore bare metal to a particular date and time, could you do it? How long would you be offline? Have you tested your backups to ensure that they work?
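
To make the dailies/weeklies/monthlies/yearlies question concrete, here is a minimal Python sketch of which backups such a retention scheme would keep. It is only an illustration: the function name `backups_to_keep`, the default counts (7, 4, 12) and the one-backup-per-day sample history are assumptions, not taken from any backup product.

```python
from datetime import date, timedelta

def backups_to_keep(backup_dates, today, dailies=7, weeklies=4, monthlies=12):
    """Return the set of backup dates retained by a daily/weekly/monthly/yearly scheme."""
    # Newest first, ignoring duplicates and anything dated in the future.
    ordered = sorted((d for d in set(backup_dates) if d <= today), reverse=True)

    def newest_per_bucket(key, limit=None):
        # Walking newest-first, the first date seen in each bucket is its newest.
        newest = {}
        for d in ordered:
            newest.setdefault(key(d), d)
        picked = list(newest.values())  # most recent bucket first
        return set(picked if limit is None else picked[:limit])

    keep = set(ordered[:dailies])                                   # dailies
    keep |= newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weeklies)  # weeklies
    keep |= newest_per_bucket(lambda d: (d.year, d.month), monthlies)          # monthlies
    keep |= newest_per_bucket(lambda d: d.year)                     # yearlies, kept indefinitely
    return keep

# Hypothetical history: one backup per day for 400 days.
today = date(2024, 3, 31)
history = [today - timedelta(days=i) for i in range(400)]
keep = backups_to_keep(history, today)
expired = sorted(set(history) - keep)
```

Anything in `expired` is a candidate for deletion, which is one way to keep backup storage bounded while still being able to restore to a range of points in time.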

Any server can host an intrusion. Databases are a high-priority target.
btan (Exec Consultant) commented:
A signature is written specifically to detect malicious content, so a match should indicate malicious intent. If there really is a tendency towards false positives, then the signatures should be granular enough that new ones can be put in Alert state rather than Block state. I am not sure whether Trend Micro can do that, but some WAF appliances can stage rules and signatures before the block action takes place. If the signature is IPS- or AV-like, it is better to err on the safe side.

Whether an application breaks can only be ascertained in staging and eventually in production. No principal or solution gives a 100% warranty, but breakage should be kept to a minimum, and it is good to know why it happens. Most of the time a legacy app requires some action that Trend Micro deems a positive, in which case an application exception or whitelisting probably needs to be enforced.

Snapshots are typically incremental daily with a full backup at the end of the week, though practice varies. Nonetheless, a VM snapshot should be the last known golden state, verified to be hardened and clean of misconfiguration and malicious content. That is just the baseline; the important thing to capture and back up is the data. Hence the DB information is required and cannot be lost unless the risk exposure is known and accepted. You cannot regenerate real-time data, and data corruption is a big issue when dealing with privacy and legacy matters where the company has vouched to safeguard data and prevent breach and taint incidents. Responsibility and accountability are key to ensuring data resiliency and privacy.

Turning to Trend Micro Deep Security, their practice guide (under the "Disaster and Recovery" section) also says to make sure a regular backup of the Deep Security database is scheduled, especially when applying a patch or an upgrade to the software. This is mainly so that the database can be restored or recovered from the same version number as the DS Manager.

So in short, assume a signature update could really cause a crash even if it passes their internal checks, as there are real cases of this for other AV and OS providers too. They are not liable, as the user has knowingly taken the risk, so they can only advocate best practices for such unforeseen circumstances; no code is guaranteed to be secure and free of bugs. For any upgrade of DS they will always state: please back up the DS database, as it is highly recommended.

Other extracts to note:
Scan Schedule Setting
In addition to scan configurations, there is also an option to set a schedule for all types of scans, including real-time scan. This can be useful if there is a specific timeframe where you'd like to turn off real-time scanning to improve performance.

- File Server is scheduled to have a backup of all files every day at 2:00-4:00am
- This server will most likely have high activity during this time and whitelisting the 2:00-4:00am timeslot from real-time scan activity would significantly help improve performance for both the backup task and server resource.

- Perform a full manual scan on a server prior to running the actual backup task
- We recommend that weekly scheduled scans are performed on all protected machines.
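
The scheduling idea in the extract above (real-time scanning off during the 2:00-4:00am backup) can be sketched as a simple time-window check. This is plain Python for illustration only, not Deep Security configuration; `BACKUP_WINDOW` and `realtime_scan_enabled` are hypothetical names.

```python
from datetime import time

# Window taken from the example above: backups run 02:00-04:00.
BACKUP_WINDOW = (time(2, 0), time(4, 0))

def realtime_scan_enabled(now, window=BACKUP_WINDOW):
    """Return False while `now` falls inside the exclusion window (end exclusive)."""
    start, end = window
    if start <= end:
        in_window = start <= now < end
    else:
        # Window wraps past midnight, e.g. 23:00-01:00.
        in_window = now >= start or now < end
    return not in_window
```

For instance, `realtime_scan_enabled(time(3, 0))` is False, so real-time scanning would be paused while the backup runs.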

sunhux (Author) commented:
Regional Trend got back to us that they don't recommend more than
350 signatures per VM (though they have about 3700 signatures) as
it will affect the VM's performance (even for a 4 vCPU 16GB RAM VM).

That simplifies things a lot for me: select only the critical exploits
to apply.  The odd part is what we do about the remaining 3350 or so
signatures if one of those vulnerabilities happens to hit us.

Looks like I have to strengthen it with network-based IPS at the
outer perimeter & can't rely completely on the host (ie VM)-based
IPS.
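
As a rough illustration of picking signatures under a per-VM cap, here is a small Python sketch that ranks rules by the severity ratings mentioned earlier and keeps only the first 350. The rule dictionaries, the `select_rules` name and the sample data are hypothetical; this is not the Deep Security API, and a real selection would also filter by what each VM actually runs.

```python
# Severity ratings as listed in the question, most urgent first.
SEVERITY_ORDER = ["Critical with Exploits", "Critical with Smart", "High", "Med", "Low"]

def select_rules(rules, cap=350):
    """Split rules into (selected, deferred), filling the cap from the top severity down."""
    rank = {name: i for i, name in enumerate(SEVERITY_ORDER)}
    # sorted() is stable, so rules of equal severity keep their input order.
    ordered = sorted(rules, key=lambda r: rank[r["severity"]])
    return ordered[:cap], ordered[cap:]

# Hypothetical catalogue of 3700 rules with severities spread evenly.
catalogue = [
    {"id": i, "severity": SEVERITY_ORDER[i % len(SEVERITY_ORDER)]}
    for i in range(3700)
]
selected, deferred = select_rules(catalogue)
```

The `deferred` list is what would have to be covered some other way, for example by the network-based IPS at the perimeter.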
btan (Exec Consultant) commented:
Indeed, it's a defense-in-depth, divide-and-conquer strategy, since we also do not want a single point of failure (such as the VM processing limit mentioned, which could be susceptible to an adversary's DoS etc). Overall, it may be ideal to have the "fight" at the exterior/outer perimeter instead of at the "doorstep" of the target server.
sunhux (Author) commented:
> can stage rule and signature before block action take place
Yes, Trend Micro's Deep Security allows the signatures to be
deployed in 'Detect' mode first & later converted to 'Block' mode.

So I'll have a lot to analyse in Detect mode first before we go
into Block mode.
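
One way to organise that Detect-mode analysis is sketched below in Python: after an observation window, rules whose Detect events were all confirmed as genuine (or that never fired) become candidates for Block, and the rest stay in Detect for review. The event format, the `verdict` values and the function name are assumptions for illustration, not anything exported by Deep Security, and treating a rule that never fired as safe to promote is itself an assumption.

```python
from collections import Counter

def plan_block_promotion(rule_ids, detect_events):
    """Return (promote, hold): rules safe to move to Block, and rules needing review.

    Each event is a dict like {"rule_id": 1002, "verdict": "false_positive"}.
    Any verdict other than "true_positive" (including "unreviewed") holds the rule back.
    """
    doubtful = Counter()
    for event in detect_events:
        if event["verdict"] != "true_positive":
            doubtful[event["rule_id"]] += 1
    # Counter returns 0 for rules with no doubtful events at all.
    promote = [r for r in rule_ids if doubtful[r] == 0]
    hold = {r: doubtful[r] for r in rule_ids if doubtful[r] > 0}
    return promote, hold

# Hypothetical review results for three rules.
events = [
    {"rule_id": 1001, "verdict": "true_positive"},
    {"rule_id": 1002, "verdict": "false_positive"},
    {"rule_id": 1002, "verdict": "unreviewed"},
]
promote, hold = plan_block_promotion([1001, 1002, 1003], events)
```

The counts in `hold` give a rough priority list for tuning or exception work before those rules go anywhere near Block mode.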
btan (Exec Consultant) commented:
That is the challenge of tuning: getting false positives down to a minimum for your environment and establishing the norm so the anomalous stands out. Surely we need 'zero' false negatives. Trend Micro should already detect the known bad, and whitelisting will probably give you another layer of defense. Blacklisting is always playing catch-up; we cannot neglect it, but in the long term it leads to security fatigue, as you already understand.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.