SEP 2009 MAR

Enabled HTTP-tunneled streaming (RTMPT) of media from the Flash Media Servers.  This was done at the request of the streaming media team to allow better access through firewalls.

Evaluated problems on some of the servers within the shared web environment in an attempt to improve stability and overall service availability.  Increased memory on several of the servers in order to accommodate the growing number of Content Management Systems (CMS) that are being deployed.  Reconfigured system logging to reduce ‘noise’ created by some services and increase the overall usability of the primary system logs.  New watchdog scripts have been put in place to monitor the size of log files and rotate/compress the logs as needed to reduce storage utilization.  Installed and tuned the OSSEC client on the pair of servers handling the Vanderbilt Student Organization content.  The OSSEC client will be installed on all of the shared web servers once alert tuning has been completed.  OSSEC is an open-source Host-based Intrusion Detection System (HIDS) with real-time log evaluation which reports to our OSSEC server.
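For reference, the log-size watchdog idea can be sketched in a few lines of Python.  This is only an illustration of the approach, not the script actually deployed: the function name, the size threshold, and the archive naming are all assumptions made for the example.

```python
import gzip
import os
import shutil
import time


def rotate_if_large(path, max_bytes, stamp=None):
    """Compress `path` to a stamped .gz and empty it if it exceeds max_bytes.

    Returns the archive path, or None when no rotation was needed.
    """
    if not os.path.isfile(path) or os.path.getsize(path) <= max_bytes:
        return None
    stamp = stamp or time.strftime("%Y%m%d%H%M%S")
    archive = f"{path}.{stamp}.gz"
    # Copy the current contents into a compressed archive next to the log.
    with open(path, "rb") as src, gzip.open(archive, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Truncate in place rather than delete, so a service that opened the
    # log in append mode keeps writing to the same file.
    with open(path, "w"):
        pass
    return archive
```

A cron job could call `rotate_if_large()` for each log of interest.  Anything written between the copy and the truncate is lost, so a real deployment would more likely lean on logrotate or have the service reopen its log.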

All remaining shared web and MySQL servers have been migrated from the AMD cluster to the newer Intel 7100 cluster within the Hill Center ESX environment.  These were the last servers I am responsible for which needed to be migrated in preparation for decommissioning the AMD cluster.

Took over working with the outsource developers for the new Blair School of Music web site.  The site is being developed in Drupal.  Due to performance issues with the site, a dedicated virtual server has been provisioned and configured for use in this endeavor.  Having a dedicated server for this has been a great aid in determining the root causes for the performance problems.  These issues have been resolved and development is continuing to make progress on the project.  Once development is complete we will evaluate the feasibility of moving the site back within the shared web environment.

Six-month OS and security patching of Linux hosts has begun.  So far patching has been completed for one of the bastion hosts and the backup environment.  This patching is being done by service groupings for easier manageability and will continue throughout October.

Add comment October 2nd, 2009

August 2009 MAR

Decommissioned the old OSIS virtual machines which completed the project for migrating the OSIS environment to RHEL4 and from the AMD to the Intel ESX clusters.

Nagios has been fully migrated from a virtual machine to a physical machine.  This move was related to performance issues which arose after patching.  The new service has been upgraded to Nagios 3.0.6 and is also utilizing the NDOUtilities for logging to MySQL.   The migration went very smoothly overall.  The service has been stable again since the migration was completed.

Began work on building an additional Streaming Media Server.  The new server will be located on the VUMC network in order to better support their growing needs in streaming presentations, etc.

Groundwork has begun on planning for the life-cycle of the JPROD servers.  To more completely mirror production services, the JTEST1 replacement will consist of two servers.  After discussing projected growth and comparing it to existing system specifications, a request for options and quotes has been sent to our vendor.

No work has been done recently to migrate the ND&E “weathermap” to a new server.

DHCP migrations are moving quickly for the DNS project.  Some minor feature changes were requested and completed for the reporting/scheduling page.

At the request of Network Security, the port-block pages have been slightly overhauled and fixed.  It was noticed that the pages were not working properly after being migrated to the new web servers.   So far no other problems have been reported.

Migrated a few smaller databases from mysql01 to mysql02 in a continuing effort to retire the old MySQL4 server.

There have been no action items assigned in the GMail for Life project.

Add comment August 26th, 2009

July 2009 MAR

Google Search Appliances have been placed into production.  The old Ultraseek FQDN has been replaced by a redirection page that collects referring pages.  These can be viewed through a limited-access administrative interface to identify sites which need to be updated to support the GSAs.  The old Ultraseek servers are currently being decommissioned.
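The referrer collection behind the redirection page boils down to tallying the Referer header of each incoming request.  A minimal sketch of that tally in Python follows; the function name and the choice to ignore query strings are assumptions for illustration, and the real page may store its data quite differently.

```python
from collections import Counter
from urllib.parse import urlsplit


def tally_referrers(referers):
    """Count hits per referring page, ignoring query strings and blank headers."""
    counts = Counter()
    for ref in referers:
        if not ref:
            continue  # request arrived without a Referer header
        parts = urlsplit(ref.strip())
        if parts.netloc:
            counts[f"{parts.scheme}://{parts.netloc}{parts.path}"] += 1
    # Most frequently seen pages first: these are the ones to update soonest.
    return counts.most_common()
```

Sorting by hit count puts the pages that still send the most traffic to the old hostname at the top of the list.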

Nagios performance problems remain an issue.  Due to the amount of resources the host is utilizing within the ESX environment, the service will most likely be migrated to a physical host.  Available hardware has been found and the initial build has been initiated.

The AmCom migration from 4.0.63 to 4.5 has been completed.  Fail-over to the new system proved to be a smooth transition.  The remainder of the OSIS environment has now been migrated to the new system.  Decommissioning of the old web/db servers will take place in the near future.

IMAP2 was renamed IMAP-DEV and moved from the production to the ITS-test network in order to test upgrading the cyrus-imapd packages.   Installation has been completed.  Currently testing migration and integration of the new cyrus-imapd build with the Solaris CSW bundled package.

Installed the SLAMD client on several web-enabled development hosts for IDM load testing.  Set up firewall access for the UDP ping utility which will be used to monitor network health between the individual LDAP servers during the load tests.  Worked with Roland to set up access to the UDP ping service on hosts behind the F5.

Worked to migrate databases from mysql01 to mysql02.   This is an ongoing effort working toward the eventual decommissioning of mysql01.

Add comment July 31st, 2009

JUN 2009 MAR

Google search appliances arrived and were set up.  Public Affairs is working with the appliances now to ensure appropriate spider settings and to customize the user interface.  Failover testing and a more open community test will be done in early July.  Production date is scheduled for the second week of July.

AmCom came on-site to perform the replacement OSIS server installations.  The vendor decided that they would prefer a clean install versus upgrading the current systems.  Using their software specifications, 4 new servers were built and turned over to the vendor.  Upgrades went smoothly once a few minor software dependency problems were resolved.  The vendor continues to work on client-side application upgrades.  Production is planned for the middle of July.

The web pages for supporting IP/DNS requests for subnets which have been migrated to the DiamondIP appliances have been rewritten and are again available for use.  The perl API supplied by the vendor stopped functioning after a recent update.  The vendor was disinclined to support the perl module which they supplied because it is not part of their standard support package, so a new solution was found using their CLI API, which is supported.  The pages which utilized the perl API have now been rewritten in PHP and should be supported by future updates.

LISTSERV began to exhibit performance problems.  Worked with the email team and the vendor to troubleshoot the issue.  Neither the vendor nor we were able to identify a definite root cause of the performance problems.  The vendor suggested that we try using their High Performance Option (HPO).  After obtaining a trial key and testing to evaluate the performance differences, it was decided to place the key on our production host.  At the same time a new PHP login page, which will give more feedback to users, was activated to replace the previous perl CGI script.

Work and testing continues on the SMTP replacements.  So far we have had successful tests of inbound, outbound, and vacation routing and delivery.  We hope to be ready for a production SMTP replacement very soon.

Still no tasks for me from the GMail for Life project.  IDEV are continuing to work on their solution.

PBX Pager project will have some tasks for me soon as the iDEV group draw closer to being ready for production.  Testing on JTEST1 continues and is going well.

The Nagios replacement has been put on hold due to higher priority projects.  There are still some issues with getting the logging agent to correctly connect and log to mySQL.

Add comment June 29th, 2009

MAY 2009 MAR

Completed migrations of Nagios, Owl, and VUNetID from the AMD Cluster to the Intel 7100 Cluster.  Nagios was moved earlier than planned due to problems within the AMD Cluster which we were never able to explain.  This completes the migrations I needed to get done before June in preparation of decommissioning the cluster.

The AMP2 memory upgrade was completed.  Verified that the system was properly addressing the additional 8GB of memory both in the BIOS and the OS after the upgrade.  Critical firmware patching was completed at this time as well.

Worked with Jeff and Gary to troubleshoot problems with SMTP and IMAP servers after one of the servers was removed from service and the user accounts were migrated to other hosts.  Most of the issues with the SMTP servers and VUWebmail were resolved quickly.  Peter has continued working with Jeff to work through minor problems with some user accounts in the IMAP environment.

The proposal was finally submitted for the search engine replacement.  The current solution’s licensing expires at the end of June.  Given the time constraint I looked into an at least temporary solution using the Google Search API.  This was presented to members of ITS and Public Affairs and was deemed a good idea but not a robust enough solution.  After doing some other research I started testing an open source solution called ‘Sphider’ which is php/mysql driven and has its own web crawler.  This solution was dropped due to concerns about using open source, though it would have been a nearly 1-to-1 replacement for our current search engine.

Judy was assigned the task of decommissioning the 2 Kiosks located in the hallways in the Hill Center.  She requested help in doing so since I had done the original installation and setup of the Kiosks.  I told her I would take care of it.  She said to let her know if there was anything she could assist with and asked that the televisions remain mounted on the walls with CNN being shown.   I enlisted the help of Warren and after removing the NEC Bluefire appliances it was discovered that the televisions did not have built-in tuners.  We located 2 unused VCRs in the building and connected them to the televisions after obtaining permission to use them.

Continued meeting for the GMAIL for Life project.  IDEV is still working on their parts of the project.  No new activities have been assigned for me as of yet.

It was discovered that the OnSite::API perl module provided by BT Diamond IP is no longer working properly, most likely due to recent patches.  After more detailed troubleshooting it was shown that the API is in fact working, but is returning a NULL array for the specific queries needed for integration into our website.  A ticket was opened with the vendor concerning this on May 21.  We are still awaiting a response from the vendor on the ticket.

Add comment May 29th, 2009

APR 2009 MAR

The memory upgrade requested by ND&E for AMP2, following recommendations from their software vendor, has finally arrived.  Due to the change moratorium during finals and graduation, the installation will not be completed until mid May.

Worked with the vendor for the VEHS Radiation database.  A working solution has been completed after nearly two weeks of trial and error, as well as 5 updated versions of the client software provided as problems were found.  The client is successfully allowing admin password updates with a hashed/encrypted password, SSL connectivity to the MySQL database, and SSL connection and authentication against our Active Directory servers.  A client installation package has been created and passed along to Joanne in Desktop Support to install on VEHS workstations.  Joanne has provided a list of workstations which will be running the client and needing MySQL connectivity.  Currently awaiting the go-ahead from Peter on the new MySQL 5 server for final deployment.

Began researching installation needs for replacing our current Nagios server.  The replacement will be running Nagios v3 and will utilize a MySQL database for logging versus the standard flat files.  SNMP Trap handling will also be incorporated for better monitoring of services such as the Cisco CSM load balancer.

Worked with the email and security teams in identifying webmail accounts which were found to have been compromised.

Finalized configuration of a new flash server located on the VUMC network and worked with VUMC Network Security to get appropriate access to the new server.  This was deployed to handle streaming load on the VUMC network for their graduation ceremonies.  ITS Streaming Services are in the process of testing the new server.

There are no updates on the Search Engine Replacement project.  The SDM reports that the report is being finalized to be put forth for financial approval and governance.  The contract for current Ultraseek Service will expire in the next 60 days.

Completed a couple of requests from iDev on the ‘GMail For Life’ project.  No additional tasks have been presented to me while iDev works toward their intended solution.

Completed installation and initial configuration of the new HypericHQ server for testing.  Requested basic firewall access from Network Security and setup accounts for the NOC Manager to be able to begin testing.

Began working on presenting the GNAV database for VUMC and MIS for data mining and reporting.  In the process it was discovered that the vendor-provided backup script appeared to have been working but was not properly performing backup operations.  Instead of attempting to fix the existing script, a new one was written.  Backups are now compressed, which allows for retention beyond a single daily backup.  Replication of backups to the secondary server has also been incorporated into the script and tested.  Any failures encountered by the script should trigger a notification email to the Unix Team.
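The compress-and-retain part of a backup script like this can be sketched roughly as follows.  This is a simplified Python illustration under stated assumptions, not the script that was written: the function name, the date-style labels, and the `notify` callback (standing in for the email to the Unix Team) are hypothetical, and replication to the secondary server is left out entirely.

```python
import gzip
import os
import shutil


def compress_and_prune(dump_path, backup_dir, label, keep=7, notify=print):
    """Gzip dump_path into backup_dir as <label>.gz and keep the newest `keep`.

    `keep` must be at least 1.  Returns the new archive path, or None after
    reporting a failure through notify().
    """
    try:
        os.makedirs(backup_dir, exist_ok=True)
        archive = os.path.join(backup_dir, f"{label}.gz")
        with open(dump_path, "rb") as src, gzip.open(archive, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # Labels are assumed to sort chronologically (e.g. 2009-04-27),
        # so everything before the last `keep` entries is the oldest.
        archives = sorted(f for f in os.listdir(backup_dir) if f.endswith(".gz"))
        for old in archives[:-keep]:
            os.remove(os.path.join(backup_dir, old))
        return archive
    except OSError as exc:
        notify(f"backup of {dump_path} failed: {exc}")
        return None
```

Routing every failure through one callback keeps the notification path easy to test and easy to swap for a real mailer.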

Due to incompatibilities with archived Windows Media files, ITS Streaming Services will no longer be pushing for upgrading the Helix servers to v12.  Per request, all previous revisions of v11.x have been installed on the test server.  Installations have been made and access has been given to allow easily changing the running version on the server for testing.  ITS Streaming hopes to find a preferred previous version with the highest number of working features and the fewest bugs to move the production servers to at a later date.

Add comment April 27th, 2009

MAR 2009 MAR

Corrected missing/unconfigured/out-of-date vmware tools on assigned systems.

Started scheduling and performing needed OS and application patching of systems.

Rolled out new mailing list server.  Handled operational/post-launch problems as they were found.  Decommissioned old LIST-SRV1 host.

Rolled out new Cacti monitoring server.  New server is configured for using a monitoring daemon instead of a cron job which handles our environment much better.  Noc-apps has been decommissioned and re-tasked to testing/evaluating a new monitoring solution: HypericHQ.

Wrote a script package which allows for much easier deployment of the OSSEC monitoring agent onto Linux hosts in the environment.  Demonstrated use and logistics of the script and its functions to the unix team.

Found vlan problem which was causing sitemason frontends to lose connectivity to the NFS share when bonded to the problem interface on one esx server.  Worked with the NOC and ND&E to identify, verify, and correct the problem.  There have been no outages in the sitemason environment since this was found and corrected.

Rebuilt Helix servers on RHEL5 in order to migrate the service from the AMD cluster to the Intel cluster.  Scheduled for completion Mar 29th.

Initiated rebuild of vunetid for migration from AMD cluster to Intel cluster.

Worked with team members on the DNS project to complete the RFC1918/Split-DNS implementation.  Awaiting solution from the vendor to address issues found in the test environment.

Attended the initial meeting concerning a new project, “GMail for Life”, which will be coordinated with iDEV.  Awaiting more details on the project before moving forward.

Attended meetings concerning implementing a database driven application for VEHS.  The SDM is still working with the software vendor and VEHS on details for the project.  Awaiting more information, specifications, and software before being able to move forward.

There has been no activity by myself on the Search Engine replacement project.

Continued working with Storage team on the SMTP replacement project.    Standard SMTP configurations and loaders have been tested and appear to be working as intended.  Vacation implementation is still in progress.

Add comment March 26th, 2009

FEB 2009 MAR

Sitemason stability issues appear to have been finally resolved.  An extra frontend has been added to better handle large surges of traffic.  New watchdog scripts monitoring memory usage by perl and connectivity to the database have made a tremendous difference in the quality of the service.
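The memory check in a watchdog of this kind amounts to scanning the process list for perl processes above a threshold.  A small Python sketch of that check is below; it is an illustration only, and the function name, the input format, and the threshold are assumptions rather than details of the actual scripts.

```python
def processes_over_limit(ps_output, command, limit_kb):
    """Return [(pid, rss_kb)] for `command` processes whose RSS exceeds limit_kb.

    Expects lines of "PID RSS COMMAND", such as the output of
    `ps -eo pid=,rss=,comm=` on Linux.
    """
    offenders = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, rss, comm = fields
        if comm.strip() != command or not (pid.isdigit() and rss.isdigit()):
            continue
        if int(rss) > limit_kb:
            offenders.append((int(pid), int(rss)))
    return offenders
```

What to do with the offenders (log, restart the frontend, or alert) is a policy decision left to the caller.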

Bandwidth problems stemming from antivirus updates have been resolved.  Traffic generated by clients downloading large updates was greatly impacting performance.  FTP access is now limited in the number of connections allowed, versus the previous ‘unlimited’ configuration.  HTTP access has been reconfigured to use mod_bw, which monitors and limits bandwidth use.  Total bandwidth for HTTP is being limited to 100 Mbps with a minimum connection speed of 200 kbps for clients, roughly the speed of most home DSL connections.  All files over 10 MB are limited to a maximum speed of 300 kbps.

Work has continued on SMTP replacement.  Kenon and Derek have been officially handed the project to complete, though I am still working with them as I am able.

The vuwebmail environment has been patched and updated.  Along with the OS updates, content on the login page has been updated at the request of Public Affairs.  The new login content is dynamic and pulled directly from content they already manage on the main website.  This change allows Public Affairs to indirectly update the content displayed on the login page presented to students using the VUWebmail service.

Listserv is ready for production and is being deployed during spring break.  A new development server has been created which mirrors the final configuration of the new server so that we have a valid test platform for later updates.

Upgraded Flash Media Server to version 3.5 and moved the service to the ‘Streaming media’ network.  The upgrade was intended to provide some fixes and enhancements to the service while the network change was to address some off-campus performance issues.

Add comment March 2nd, 2009

JAN 2009 MAR

Continued work on the Listserv project.  Tools from the current production Majordomo server have been adapted to work with the new service.  Testing is going well.  The replacement should be ready for production soon.

Cacti replacement has been built out.  Configuration of the new server should be able to handle the enterprise environment better than the current implementation.  VM inventory handling scripts need to be migrated from the current server before it is ready for production.

Sendmail configuration appears to have been completed on the smtp replacement.  Work has been done on the loaders but is not yet completed.

Sitemason enhancement has been moving quickly.  Research and changes to the production environment have made little impact on stability so far.  Testing is currently underway for replacing production with a new load-balanced solution.

A date has finally been set to complete patching of the OSIS servers.  Patching will be performed by the vendor.

Worked with the SDM on the Ultraseek replacement to work on requirements for creating new VMs to replace the existing physical servers in the event that we decide to stay with the current solution.

Worked with NEC to troubleshoot the periodic lockouts experienced by users of GNAV.  No root cause for the lockouts was found.

An updated version of the Flash Media Server has been released.  Will be coordinating with Streaming Media to perform the upgrade.  Also planning to migrate the server to the StreamingMedia network at the same time in an attempt to improve performance.

Red Hat training for February has been canceled.

Add comment January 27th, 2009

NOV 2008 MAR

Completed updating Legato Networker on my assigned systems.  Wrote a simple script and made a tarball which adds the apphost group and all unix admins (as needed), updates the public SSH keys, installs rootsh if not already installed, updates the system MOTD, checks for the Altiris agent and installs/upgrades as needed, and installs/upgrades the Legato client as needed.  Dan has hijacked part of the update script and incorporated it into the RHN kickstart/registration script.

Completed security patching and upgrades on the OSIS-WEB/DB01’s.  OS patching completed without errors or complications, but Oracle patching by the vendor took much longer than expected.   Failover and testing afterward appeared to be good.  Upgrades on the second set will proceed after the holiday.

Final patching and bug fixing was done on LISTSERV for the list replacement project.  Only minor changes had to be made to pass security scans before pushing the project into full testing.

Helped with recovery procedures throughout the environment after an unexpected power outage.  Helped to identify problems which were discovered during DR and to work on solutions to correct them.

Worked with Streaming Media and Real Networks to further troubleshoot the wmv playback issue we have discovered.  Nothing the vendor has suggested corrected the issue as of yet.  Currently they have directed us to test on a physical server instead of a VM since they do not officially support their products in a virtual environment.

Worked with Aquis and telecom to get the circuit tested and working for the short-haul modem connected to JTEST1.  As of yet the serial link is not showing as active on the host, but manual testing with a laptop verifies that the link is good.  OS configuration appears to be correct.  Need to work with iDev to verify the connection configuration within their application and expand testing from there until the solution is found.

Began working to complete the project to have a physical Linux bastion host.  So far the network configuration has been set up and tested for high availability locally.  Package tuning and custom configurations are being completed to mimic the current bastion host implementations.

Add comment November 25th, 2008
