Friday 29 January 2010

Hard driving SQL

We have been working on installing an SAP ERP system for some time now. It went live in the latter part of 2009, and almost immediately we started to get performance issues. After some discussions, we were advised that we should move a number of components from the SQL server to separate disks.

The server had originally been set up to the specific instructions of the system integrators, and they had carried out the installation of their software. We had 2 logical drives: the operating system on the C: drive and the rest of the product on drive D:.

Essentially, they now advise that we should put the paging file, tempdb files, and transaction log files all on separate logical drives. This does make sense; with the extra drives, there will be less data being processed at the same time by the same equipment. However, the server we have is an HP Proliant DL380 with space for just 6 drives. As all the slots are full, we can’t physically add any more to the existing device.

However, there is a way around this; HP sell external disk arrays which can be added to an existing server. In our case, we obtained the MSA 20 unit, which holds 12 SATA drives, and this is connected to an HP 6400 SmartArray controller card. We ordered all of the required equipment back before Christmas, but unfortunately we had a series of problems getting the hardware. The bad weather didn’t help as we are a bit out on a limb, and the various bits were coming from different depots, so weren’t despatched together.

Last week, after all of the equipment had finally turned up, we set out to do a test of the process of adding the hardware and this went through fairly well. It took us about 5 hours as we wanted to double check everything at each stage to make sure it worked; we had not had the chance to do something like this before and wanted to be certain it would work. We made notes of the steps and waited for the Sunday so that we could make a start on adding the new hardware to the main system.

The controller card was very easy to add. Pop open the cover, lift out the holder, insert the card and replace the cover. I also connected the cable to the disk array at the same time as I found that easier than trying to fiddle about in the back of the rack trying to make the connection. The slot that the cable uses on the back of the card is quite small and difficult to reach when the server is back in place.

When we fired up the server, it ran through the normal POST routine, and it quickly identified the new Smart Array device. It took a while for the disks to initialise; about 12-15 minutes for them all. However, we then hit our first snag; when it reached the end of the initialisation, it suddenly crashed and re-booted. Funny thing though, when the server restarted, it went back to the initialisation routine and then completed perfectly.

It was then necessary to set up the logical drives and this is really easy to do. Within the configuration utility, just select the physical drives, the type of RAID and away you go. We chose to put 3 drives at a time in a RAID 5 configuration. It should give the space we need and the protection that we wanted, and we get 4 logical drives (12 HDD divided by 3 = 4). With all 4 done, we could then re-boot the server, and see the new drives in the disk manager – we set it to create a new partition on each of the logical drives and format appropriately.
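To put some numbers on that layout, here is a minimal sketch of the arithmetic in Python. It relies only on the general RAID 5 rule that each array gives up one drive's worth of space to parity. The function name and the 250 GB drive size are purely illustrative assumptions; the actual size of the drives is not given here.

```python
def raid5_layout(total_drives, drives_per_array, drive_size_gb):
    """Return (arrays, usable GB per array, total usable GB) for equal RAID 5 sets.

    drive_size_gb is a hypothetical figure supplied by the caller.
    """
    if drives_per_array < 3:
        raise ValueError("RAID 5 needs at least 3 drives per array")
    if total_drives % drives_per_array != 0:
        raise ValueError("drives do not divide evenly into arrays")

    arrays = total_drives // drives_per_array
    # One drive's worth of capacity in each array is consumed by parity.
    usable_per_array = (drives_per_array - 1) * drive_size_gb
    return arrays, usable_per_array, arrays * usable_per_array


# 12 drives in sets of 3, assuming (hypothetically) 250 GB per drive.
arrays, per_array_gb, total_gb = raid5_layout(12, 3, 250)
```

With sets of only 3 drives, a third of the raw capacity goes to parity; larger sets would waste less space but give fewer separate logical drives.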

All of this took us about an hour, perhaps just a bit over. We then moved the paging file and set it to a slightly larger size than before – a quick reboot and still everything was going well. We copied the tempdb folder to a new drive and then used a SQL script that we had found for dropping it and then re-attaching to the new location. It took literally only a few seconds to do and we were starting to get really cocky. Then it all went wrong.
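For anyone wanting to do the same with tempdb, the following is a rough sketch and not the script that was used on the day. It assumes the commonly documented approach of `ALTER DATABASE tempdb MODIFY FILE`, the default logical file names `tempdev` and `templog`, and a hypothetical target folder `T:\tempdb`. The Python only builds the T-SQL text; it does not connect to a server. The new location takes effect once the SQL service has been restarted.

```python
from pathlib import PureWindowsPath


def quote_literal(value):
    """Wrap a value as a T-SQL string literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def tempdb_move_statements(target_dir):
    """Build the statements that repoint tempdb's data and log files.

    Assumes the default logical names 'tempdev' and 'templog'.
    target_dir is a hypothetical folder on the new logical drive.
    """
    folder = PureWindowsPath(target_dir)
    files = [("tempdev", "tempdb.mdf"), ("templog", "templog.ldf")]
    return [
        "ALTER DATABASE tempdb MODIFY FILE "
        f"(NAME = {logical}, FILENAME = {quote_literal(str(folder / physical))});"
        for logical, physical in files
    ]


# Hypothetical target folder; substitute the real drive and path.
statements = tempdb_move_statements(r"T:\tempdb")
```

Printing `statements` gives two lines that can be pasted into a query window and checked by eye before running them.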

We stopped the SQL service to copy the transaction log over – all 38 GB of it! We then started the copy process and it took ages. It seemed to copy about 8 GB and then it would pause for ages (almost 20 minutes), before then carrying on. We got to a point where it had reached around 12-14 GB and the damn server blue screened (one of the few occasions that we have seen Windows Server 2003 do a BSOD).

It turned out to be a paging fault error – once the server had restarted, we modified the paging file to put it back to the same minimum size that it had been, although we left it on the same max size. I restarted the copy process and we waited.. and waited… and waited…. and waited…..

Eventually, after about another two and a half hours, the copy process finished. We then ran the SQL commands to change the database to point to the new trans log location and once done, we verified that this was correct. We then ran up the ERP to make sure that it worked and it was good. By this time, it was well after 1:00 pm – we quickly finished everything off and locked up, then headed off to a local watering hole for Sunday lunch on the company.
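As a hedged illustration of that last step, the sketch below builds the kind of T-SQL that repoints a transaction log and then checks the result. It is not the set of commands that was actually run. It assumes SQL Server 2005 or later (for `sys.master_files`) and follows the offline / move / `MODIFY FILE` / online pattern rather than stopping the whole service. The database name `ERP`, the logical name `ERP_log` and the path are all hypothetical.

```python
def bracket(name):
    """Quote an identifier with square brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def literal(value):
    """Wrap a value as a T-SQL string literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def log_move_statements(database, logical_log_name, new_log_path):
    """Return (before_copy, after_copy) lists of T-SQL statements.

    Run before_copy, copy the .ldf file at the OS level, then run after_copy.
    All three arguments are placeholders supplied by the caller.
    """
    db = bracket(database)
    before_copy = [f"ALTER DATABASE {db} SET OFFLINE;"]
    after_copy = [
        f"ALTER DATABASE {db} MODIFY FILE "
        f"(NAME = {bracket(logical_log_name)}, FILENAME = {literal(new_log_path)});",
        f"ALTER DATABASE {db} SET ONLINE;",
        # Verification: list where SQL Server now believes each file lives.
        "SELECT name, physical_name, state_desc FROM sys.master_files "
        f"WHERE database_id = DB_ID({literal(database)});",
    ]
    return before_copy, after_copy


# Hypothetical names and path, for illustration only.
before, after = log_move_statements("ERP", "ERP_log", r"L:\logs\ERP_log.ldf")
```

Keeping the verification query in the same list makes it harder to forget to check that the new path has registered before the application is started again.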

And just to finish the story off, the technician’s wife works at that hotel. Whilst we were eating, she sent a note through from the back room, demanding to know where her dinner was. So a small piece was cut off of the dinner and put on a small plate to be sent out to her – 5 minutes later a message came back demanding to know where the ketchup was!

Wednesday 6 January 2010

New year plans

So the holidays are over and we are all back to work – well almost. Unfortunately, the bad weather has caused some disruption, as a number of staff can’t get into work. Although that hasn’t affected IT staff, we are having to do a few things to help others out. Bet we don’t get any help from them when we need it later in the year!

I like to plan out what work we have to do – preferably at least a few months in advance. As such, I have a list of jobs and priorities against them and this gets updated throughout the year. At the moment, there are a large number of items for the next 3 months and quite a few for the second half of the year.

We are planning to go on a couple of specific training courses, there are some hardware and software upgrades, a couple of events that I feel would be appropriate for myself or my staff to attend and there are a number of jobs that need to be done as part of rolling maintenance programmes. We also have several projects under way and the various steps need to be arranged in the correct sequence and fitted in amongst the other work – plus of course we have the occasional problem that needs to be supported.

Unfortunately, there are several jobs that we cannot yet schedule – we are waiting for information from other people. One of our sites is proving to be a bit too small to handle the work load, so the company are looking at alternative locations. However, the senior managers can’t decide which of the newer sites would be most appropriate, so we can’t yet arrange for any work to be done that is required. Of course we know full well that when they do finally decide, they will expect all of the work to be complete within a few days!

In fact that move is going to be a much bigger task than they anticipate – once the decision is made they will then argue over the layout of the place and, almost certainly, will change what they want on a daily basis. We will be cabling up the site for a network ourselves – it saves the company quite a bit of money although it does take up a bit of time. I’ve designed a particular method of network architecture that really works for us, and provides a great deal more flexibility and scalability than the way that these installations normally get done. Most of the people doing cabling appear to be electrical installers; they think CAT 5e can be treated like standard 2 core and earth, and they seem to have a real problem if you ask for work to be done in a particular way.

On top of that, we have to get the telephone lines moved, get an ADSL connection and move all the IT equipment ourselves – the last time we had a move, we also ended up moving all the desks and cabinets as well. The staff seemed to think that they could just close down the PCs, put on their coats, pick up their handbags and walk to the new site to find the desks all set up, and the PCs installed and turned on for them! They were rather upset to find that they were expected to do some of the work themselves!

So January is looking to be quite a busy month, what with one thing and another. Happy new year!