2016 Dashboard Prototype
The service disruption dashboard has been moved to Box as of 01/26/2017, and this sheet has been made read-only. (Box version link)
Date | Time | Date of Resumption | Time of Resumption | Impacted Services | Departments / Customers Impacted | Outage Description | Outage Type | Severity | Severity Description | Svc Location | Category | Last Update By | Root Cause Analysis | Root Cause Analysis Correction | Incident Report | Comments
1/25/2017 | 6:20 AM | 1/25/2017 | 8:23 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/23/2017 | 5:45 PM | 1/24/2017 | 8:30 AM | Travel Application | BFS/Campus | Application crashes on entry | unplanned | Major | Application unavailable | on-prem | SEAL | GWest | An application bug fails to release database connections when a transaction ends. Applying UpTime monitoring to the application hammered it to the point of crashing. Normal customer use does not trigger the error. | Modify code to handle resource release when process completes. This will require an update to the existing .NET code environment.
1/20/2017 | 7:42 AM | 1/20/2017 | 7:53 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/20/2017 | 8:40 AM | 1/20/2017 | 9:14 AM | Networking at Webb Hall (Bldg 526) | Earth Science (Geology) | Rainwater entered the UPS supporting the building switch and shut it down. | unplanned | Major | Loss of networking in Webb Hall. | on-prem | NCS | KPS | Rainwater entered the building and found its way to the UPS. | Unplug UPS and disconnect batteries. Power building switch from commercial power. Identify and mitigate water entry point with Facilities and/or others as needed.
1/20/2017 | 1:23 AM | 1/20/2017 | 5:00 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/19/2017 | 6:10 PM | 1/19/2017 | 6:20 PM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/18/2017 | 8:21 AM | 1/18/2017 | 9:11 AM | Networking at West Campus (Devereux) | Housing, ECE and MRL at Devereux | Network connection to Devereux failed due to outage of VDSL modem. | unplanned | Major | All networking to Devereux failed. | on-prem | NCS | KPS | Failed VDSL modem link between 7925 and 7961. | Replaced VDSL modem.
1/17/2017 | 8:12 AM | 1/17/2017 | 8:34 AM | Telephone service | Gaucho Support Center | Modem used for the COX Metro Ethernet network stopped bridging and the NEC UG50 lost network connectivity to the PBX | unplanned | Major | Telephones were not operable | on-prem / COX | NCS | CIR | COX modem failure | COX sent reset to modem
1/17/2017 | 9:00 AM | 1/17/2017 | 1:20 PM | Google Drive | Campus | Some customers experiencing access problems with Google Drive and Keep | unplanned | Major | Users may be receiving errors when accessing Drive. Classroom users may also be receiving errors when submitting or viewing assignments. Google Keep users may be affected as well. | cloud | INFR | KGG | Google outage | Google engineers restored service | REPORT
1/13/2017 | 1:36 PM | 1/13/2017 | 1:40 PM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/13/2017 | 10:44 AM | 1/13/2017 | 10:56 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/12/2017 | 6:15 PM | 1/12/2017 | 6:21 PM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/12/2017 | 2:50 PM | 1/12/2017 | 3:00 PM | NHDC internal mgmt network | ETS @ NHDC | Loss of management network access at NHDC | unplanned | Minor | Unable to reach managed devices within data center | on-prem | INFR | KGG | Misconfiguration of VLANs during routine switch cleanup | Corrected error
1/12/2017 | 7:31 AM | 1/12/2017 | 7:58 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/10/2017 | 2:57 PM | 1/10/2017 | 2:59 PM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/10/2017 | 12:01 PM | 1/10/2017 | 12:02 PM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/9/2017 | 10:15 AM | 1/9/2017 | 10:17 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/9/2017 | 9:41 AM | 1/9/2017 | 9:43 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
1/9/2017 | 9:02 AM | 1/9/2017 | 9:15 AM | Espresso Applications | Campus | Server was unresponsive (RDP connection could not be established) | unplanned | Major | Customers were not able to access the Espresso apps | on-prem | SEAL | CMONTECINO | Server OS-related issue | Server restart
1/5/2017 | 3:00 PM | 1/6/2017 | 8:58 AM | Campus VPN | VPN users from multiple departments | There are a couple of VPN configurations in use. The server certificate for one configuration expired, resulting in client connection refusals. | unplanned | Major | Customers were unable to access some on-campus services and licensed services. | on-prem | NCS | KPS | Certificate expired. | Issued new certificate. | | Certificate was issued by local self-signed CA, hence no vendor expiration notice. Local CA will expire in two months, so new VPN server roll-out and migration of existing users is critical.
1/3/2017 | 10:23 PM | 1/4/2017 | 11:30 AM | Data Warehouse | Any Dept that accesses the Data Warehouse | Network Connection Error | unplanned | Minor | Customers need to wait before accessing the data that is normally available when they arrive at work. | on-prem | SEAL | D Etz | Failed with the following error: "[DBNETLIB][ConnectionRead (recv()).]General network error. Check your network documentation." I think the machine disconnected from the network for some reason and could not re-establish its connection within the max time limit to proceed with the SSIS package. There was also an error message on Waredev4, our production DBMS.
12/22/2016 | 7:15 AM | 12/22/2016 | 8:15 AM | UPS 2 offline | No impact to running services | 7:15 AM: There is a very brief outage in power, lasting only a few milliseconds. UPS 2 reboots and goes offline. 7:45 AM: Lance Heuer arrives and begins to diagnose and troubleshoot UPS 2, taking it through the reboot sequence and getting the unit to operational status. 8:15 AM: UPS 2 goes back online and the NHDC UPS system is fully restored. | unplanned | Minor | | on-prem | INFR | TDH | Planned outage of power by FM in North Hall 3rd floor led to mistaken reset of NHDC power circuits by FM staff. | Jim Morrison (FM Dept) to correct labeling of circuit breakers supporting NHDC | | Jim Morrison (FM) did notify us ahead of time about upcoming power maintenance (spoke to Lance), but due to human error by FM staff and poor labeling of circuits, NHDC suffered a power outage which triggered a known issue with UPS2 going offline.
12/21/2016 | 3:14 PM | 12/22/2016 | 9:19 AM | Suricata IDS, Bro NSM, p0f, TimeMachine | Campus Network Security | CPU lockup | unplanned | Major | Lost network security related data. | on-prem | NCS | ETA | CPU lockup | Upgrade kernel
12/20/2016 | 12:32 PM | 12/20/2016 | 12:50 PM | Network connectivity to Bldg 584, Facilities Yard. | Facilities | Building switch connection went down. | unplanned | Minor | Primarily affects desktops and wireless access for low-density office and storage space. | on-prem | NCS | KPS | UPS failure. | Replace UPS.
12/9/2016 | 5:00 PM | 12/12/2016 | 1:00 PM | Alert.ucsb.edu user portal | All non-administrative users | Users attempting to log into the alert.ucsb.edu portal to view or configure profiles were unable to reach the login page | unplanned | Minor | Outage did not impact the ability to send or receive alert messages, only the ability to change user profiles. Problem was only apparent for users that did not have the name cached | hybrid | OCIO | | The problem was related to a domain name resolution failure caused by a previous virtual storage failure | Server hosting alert.ucsb.edu redirect was rebooted to solve the problem
12/11/2016 | 7:00 PM | 12/11/2016 | 8:15 PM | Production distributed applications including BARC, Timekeeping, GOAnywhere, Espresso Apps. See Outage Report (when created) for expanded list of services / servers. | BFS/Campus | Applications hosted on virtual servers offline for up to 25 minutes until restarted. | unplanned | Critical | Distributed applications offline. | on-prem | INFR | DBOSSO | esx2 server faulted and powered down; VMware HA migrated to other vhosts. A rare network card firmware bug resulting in spurious voltage reporting caused the powerdown. | VMware HA brought virtual servers back. esx2: All firmware upgraded on production ESX servers on 20170114.
12/8/2016 | 10:57 AM | 12/8/2016 | 11:10 AM | PeopleSoft, Emulator, Campus Financial Applications | BFS/Campus | Ciber COMS Services reported there was a large-scale network issue at the Century Link data center, which is where our PeopleSoft servers reside. | unplanned | Critical | No access to PeopleSoft Financials and the Campus Web Applications | cloud | SEAL | ABUMMER | An RCA is being conducted by Ciber. | Will update the Root Cause Analysis once the RCA is completed.
12/7/2016 | 3:30 PM | 12/7/2016 | 3:40 PM | AD, DNS and hosts on the 128.111.125.0/24 subnet | Primarily ETS File and Print customers | Network switch inadvertent power cycle | unplanned | Major | Access to file servers affected; most visible was loss of AD and DNS lookup capability | on-prem | INFR | KGG | Staff relocating equipment from deprecated rack inadvertently power cycled network switch | Repowered switch. Servers affected have moved to standardized racks in NHDC, so this incident was precipitated by the cleanup of the old rack.
12/7/2016 | 12:41 PM | 12/7/2016 | | Problems accessing Box reported to ETS, relayed to CSF. Box Status reports: "Investigating - Some users may be experiencing issues while accessing box.com. Our engineering team is currently investigating." | Campus | Intermittent ability to access files on box.net | unplanned | Major | Access to files affected | cloud | INFR | KGG | Unknown | Cloud provider reestablished access
12/1/2016 | 11:52 AM | 12/1/2016 | 3:42 PM | Telephone | The Club | ASA firewall detected error on interface and performed a failover to secondary ASA, which was unable to restore all connections | unplanned | Major | The Club phones on the UG50 could only place calls to phones on UG50 | on-prem | NCS | CIR | Under investigation | Increased holdtime on ASA to avoid another failover to the secondary ASA. Cisco TAC case opened to diagnose the secondary ASA
11/30/2016 | 7:35 AM | 11/30/2016 | 9:11 AM | Network connectivity to Arts, Counseling & Psychological Services, Drama/Dance, Events Center, Faculty Club, Hatlen Theatre, HSSB, Student Resource Building, UCen | Campus | Failure of 10Gb line card in 515b-c core router (HSSB) blocking access to all downstream equipment | unplanned | Critical | Total loss of connectivity to named locations | on-prem | NCS | KPS | Line card failed. | Replaced line card with spare.
11/23/2016 | 5:33 PM | 11/23/2016 | 5:39 PM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
11/23/2016 | 7:57 AM | 11/23/2016 | 8:04 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
11/21/2016 | 9:25 AM | 11/21/2016 | 9:27 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
11/18/2016 | 7:58 AM | 11/18/2016 | 8:24 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
11/16/2016 | 6:00 PM | | | ServiceNow: etsc.ucsb.edu | ETS Customers using Catalog and Campus requests for Messaging and Collaboration (Connect and Zoom). | Routing from etsc.ucsb.edu to ucsb.servicenow.com/ navigation portal is not working. The same applies to other sites, e.g. ETS SharePoint site. | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | Cloud | ATS | MAC | N/A as of this writeup | N/A as of this writeup
11/8/2016 | 5:00 AM | 11/8/2016 | 12:15 PM | Production distributed applications including BARC, Timekeeping, GOAnywhere, Espresso Apps. See Outage Report (when created) for expanded list of services / servers. | Campus | EMC storage array kernel panic resulting in PROD02 filesystem hosting production virtual machines going offline. | unplanned | Critical | Distributed applications offline. | on-prem | INFR | KGG | EMC VNX storage array share PROD02 dismounted, causing VMs running off the share to fail. This in turn disabled applications. Problem traced to VNX firmware bug with data migration. Data migration now disabled. Firmware updates scheduled for Sat 11/12 will rectify the bug, which will allow us to re-enable data migration. | | REPORT
11/1/2016 | 12:00 PM | 11/2/2016 | 12:20 PM | ScreenConnect Remote Access | ETS | Update to latest stable 5.x version | planned | Minor | Upgrade for some bug fixes | on-prem | INFR | TDHOW
10/31/2016 | 8:33 AM | 10/31/2016 | 8:41 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | on-prem | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
10/21/2016 | 3:10 AM | | | External services using DynDns as DNS provider: Box, HR Jobs site (PeopleAdmin), etc. | Campus | Distributed denial of service against DNS provider blocking or impeding access to cloud resources | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | cloud | INFR | KGG | Distributed denial of service | | REPORT
10/18/2016 | 12:20 PM | 10/18/2016 | 1:07 PM | Majority of campus networking | Campus | Disruption of network connectivity, intermittent network access. | unplanned | Critical | Severe impact to campus connectivity. | | NCS | KPS | A department was conducting maintenance on a bridging firewall in NHDC and accidentally created a loop with their standby firewall. | Identified the source via logs and analysis of core-router control plane traffic. Shut down source network and disabled their VLAN. Department disabled standby firewall and service was restored.
10/16/2016 | 4:10 AM | 10/17/2016 | 11:45 AM | Applications that make Broker / RPC calls to mainframe environment, typically BARC, PORS and other legacy Espresso applications | BFS/Campus | Mainframe Adabas (DBMS) backup hung, impacting remote access to DBMS services | unplanned | Major | Broker / RPC calls to the mainframe would fail as broker and Natural servers were offline for backup processing. Adabas remained operational, so limited functionality was available for jobs wholly contained on the mainframe. | | INFR | KGG | Backup job unable to obtain vtape allocations despite there being over 2K scratch volumes available. Alerting was suspended during backup job, which hung waiting on tape volume allocation. | Backup job was canceled, broker and Natural servers restarted, BARC and PORS function confirmed. Two courses of future action: (1) determine why allocation of scratch tapes fails, (2) determine if we can alert on a scheduled job taking too long, especially during a scheduled maintenance window. | REPORT
10/17/2016 | 5:00 AM | 10/17/2016 | 11:50 AM | BARC | BFS/Campus | BARC was having a problem starting up. It crashed when it was trying to load security information through the Shim. | unplanned | Major | Major functionality is impacted, but general operations are still available. No one can log on to BARC, but the web interface is still working. This means that the student interface into BARC is not impacted. | | SEAL | KCHOI | Natural on the mainframe is having a problem, and that affects all apps (production or development) that need to use the mainframe. | | REPORT
10/16/2016 | 1:00 PM | 10/17/2016 | 11:30 AM | FlexCard Allocations, PORS and Balance Reports | Campus | When attempting to log in to any of the impacted services, a socket connection error is displayed. | unplanned | Major | Campus is unable to log in to FlexCard Allocation, PORS and Breports. | | SEAL | ABUMMER | The applications were using an older .jar file (2.0.53 compared to the latest version of 2.0.55) that communicates with the Shim. | The .jar file was replaced with 2.0.55 and the application service was restarted. From there, logging into the application was successful.
10/15/2016 | 10:30 PM | 10/16/2016 | 1:00 PM | PeopleSoft, Emulator, Campus Financial Applications | BFS/Campus | Ciber COMS service (hosting agency) applied a PeopleTools upgrade from 8.53.07 to 8.53.27. In addition, patches on the PeopleSoft Web and Application servers were applied. | planned | Major | PeopleSoft and Campus Financial Web applications are not available during the maintenance window. | | SEAL | ABUMMER | N/A | N/A
10/14/2016 | 7:55 AM | 10/14/2016 | 8:17 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
9/30/2016 | 2:00 AM | 9/30/2016 | 78:30 AM | Connexed archive video viewing | VSaaS Customers | Network provider performed a maintenance window. It should have been 15 minutes of potential downtime but left their circuits, thus ours, down for much longer. As soon as we were made aware of the outage we got in contact with and worked with the provider to get circuits back online. | unplanned | Minor | Would have impacted users trying to retrieve video archives. | | INFR | TDH
9/29/2016 | 8:00 PM | 9/29/2016 | 10:00 PM | Bro and Time-machine security monitoring | SOC | Host system experienced an NFS lock-up. | unplanned | Major | Logs not collected. | | NCS | KPS | Suspected lock-up due to insufficient buffers. | Applying updates and increasing buffers to avoid lock-up.
9/28/2016 | 8:00 AM | 9/28/2016 | 11:00 AM | VSaaS - access to customer portal | VSaaS Customers | Access to the customer portal was sporadic; recorded video was not affected | unplanned | Minor | Access to recorded video not available | | INFR | KGG
9/17/2016 | 8:00 AM | 9/23/2016 | 9:00 PM | VSaaS Camera recording on about 31 cameras | Library, Engineering, VCAD, DCS, ECON | Camera video recorder stopped responding and therefore stopped recording | unplanned | Major | No video recording for several days on 31 cameras | | INFR | TDH | UPS2 testing affected single power supply recording appliances. CVR#3 did not auto restart. Extended outage caused by manual reallocation to appliances that were online. | Apparent bug in camera reallocation process between CVR.
9/19/2016 | 9:00 AM | 9/19/2016 | 10:25 AM | EZAccess, OLGL, Hyperion or anything that attempted to access a view that was dependent on the genledger DB. | Campus | Indexes were accidentally dropped and recreated during work hours, creating timeouts for apps/customers accessing views dependent on the genledger DB. | unplanned | Minor | Indexes were accidentally dropped and recreated during work hours, creating timeouts for those accessing views dependent on the genledger DB. | | INFR | DETZ | Indexes were dropped on the main genledger DB tables, creating timeouts for those accessing views dependent on those tables. | Re-indexing the main genledger tables resolved the problem. | | Accidentally dropped indexes on Prod during work hours and had to recreate them in order to resolve the issue.
9/17/2016 | 8:48 AM | 9/17/2016 | 9:05 AM | Campus networking to multiple buildings and NHDC; campus wireless and VPN services. | Campus | Loss of power to core router in North Hall due to UPS maintenance in NHDC. | unplanned | Critical | Production or mission critical systems are down and no workaround is available. | | NCS | KPS | NHDC UPS work included planned loss-of-power for UPS2 devices. Core router was fed solely by UPS2. | Core router fed by dual sources. Working with NHDC on alternative messaging and positive feedback for planned power-outage work. | | The first attempt to access the facility to migrate power while gear was on in-rack UPS was a failure. May be an issue with key-card communication path with UPS2 down.
9/13/2016 | 4:05 PM | 9/13/2016 | 4:21 PM | TOE, TOF, Flexcard, PORS (all apps running on Tsunamiweb) | Campus | Problem bringing up the Espresso login page. | unplanned | Critical | Production or mission critical systems are down and no workaround is available. | | SEAL | KCHOI | One of the Espresso apps seemed to have used up all the CPU time.
9/12/2016 | 9:09 AM | 9/14/2016 | 8:47 AM | 100G Border connection to LA | Campus | Loss of external route diversity and capacity | unplanned | Minor | Traffic is using other paths, no link saturation or loss. | | NCS | KPS | CENIC believes it's a problem between Cisco and Brocade equipment. | Card replacement in Los Angeles scheduled for 9/13 23:00.
9/6/2016 | 3:04 PM | 9/6/2016 | 3:33 PM | Campus Wireless service | Campus | Outage due to IAM services outage | unplanned | Critical | Production or mission critical systems are down and no workaround is available. | | NCS | KPS | Wireless service unable to access IAM services | IAM services restored
9/6/2016 | 3:04 PM | 9/6/2016 | 3:30 PM | ETS Identity services and applications accessing those services | All | Network outage while adding VLAN to existing configuration | unplanned | Critical | Production or mission critical systems are down and no workaround is available. | | INFR | KGG | VLAN tagging resulted in network outage; HA configuration did not behave as expected. | Reverted to original configuration | | Outages were 15:04-15:06 and 15:15-15:30
9/6/2016 | 3:12 PM | 9/6/2016 | 3:45 PM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | | SEAL | KCHOI | Possibly related to the unsuccessful network change of the Identity system that caused outage of the system from 15:04 to 15:06 and from 15:15 to 15:30.
8/28/2016 | 14:20 | 8/28/2016 | 18:45 | UCSB Alert Management Portal and User Portal | All | Management and User web portal pages were non-responsive for approximately 4 hours | unplanned | Major | Police Dispatch notification procedures would have been impacted had an event occurred. Police did notice the outage because they were attempting to conduct training. | | | D. Drury | Vendor provided no root cause analysis, only announcement of resolution | | | The vendor claims that this event was the first time in 3 years that the management portal has become unavailable.
8/25/2016 | 8:16 AM | 8/25/2016 | 8:28 AM | BARC | BFS/Campus | Restart requested by BARC Office | unplanned | Major | Major functionality is impacted, but general operations are still available. A workaround is available. | | SEAL | KCHOI | For some years now, BARC has suddenly stopped responding from time to time. The cause is unknown. Each time this happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00 AM and the other at 5:15 PM. Before we put in the scheduled restarts, the worst case was having to restart BARC manually three times in one day. Even with the scheduled restarts, BARC still misbehaves once every week or two and needs a manual restart.
62
8/24/20168:36 AM8/24/20168:39 AMBARCBFS/CampusRestart requested by BARC OfficeunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.SEALKCHOIWe have this problem with BARC for some years now that BARC would suddenly stop responding. The cause is unknown. Every time when that happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00am and the other at 5:15pm. Before we put in the scheduled restarts, the worst experience we had was a need to manually restart BARC three times a day. Even with the scheduled restarts, BARC still occasionally misbehaves once every week or two and we need to do a manual restart.
63
8/22/20164:35 PMBARC - GMC jobsSSL certificate failure for jobs running on GWPROD data exchange to BARCWEBSERVICESunplannedMajorImpacting financial transaction jobs affecting Student information and the BARC office.TDHTroubleshooting now.
64
8/22/20162:00 PM8/22/20162:02 PM
TOE, TOF, Flexcard, Travel (tsunamiweb), TS-SUMSS
BFS, SummerSessvmotion of VMs to ESX5 server as it came onlineunplannedMinorNo reported customer impact, INF staff intercepted migration and reversedINFKGGNew ESX5 VHOST server installed to provide platform for W2K3 segregation. VMWare saw available resources and migrated VMs to ESX5, which did not have correct network services.removed ESX5 from consideration in HA / DRS balancingESX5 was initially excluded from consideration in HA/DRS though configuration work inadvertently allowed ESX5 to momentarily participate.
65
8/19/201610:15 AM8/19/201611:15 AMBARCBFS/CampusUsers cannot log in to BARC.unplannedMajorMajor functionality is impacted, but general operations is still available.SEALKCHOI/KGGProduction GMC was down. This breaks the authentication process.A copper port failed in the North Hall data center. It has affected a few systems including GWDBPROD (GMC) and WAREDBMS2 (hyperion). KGG: we need to discuss GMC role in authen. It is currently deployed as a standard server, rather than as a component of IAM. The mainframe-based ALLN01 had far more robust availability due to the nature of the mainframe platform.
66
8/19/20168:30 AM8/19/20169:00 AMOnlineGLCampusOnlineGL responding with message "Service Temporarily Unavailable".unplannedMajorApplication is unavailable.SEALGAWAutomatic reboot of a supporting production server after an unattended automated patching process appears to have caused the application to go down.ETS needs to define maintenance windows and coordinate among all involved teams.
67
8/17/2016
~7:15 AM
8/17/20169:30 AM 90% of systems restoredAll customer systemsAll staff desktops, printers, self-check and public computing systems (servers and IT systems were not impacted).The PacketFence (Network Access Control) server, which allows only approved systems to connect to the network while denying access to patron-owned systems, developed an error and effectively denied network access to all systems it controlsunplannedme Major functionality is impacted, but general operations is still available. Work around is available.MACAn error on PacketFence caused the system to deny access to all monitored systems. T3 (Library ITOps) restarted PacketFence, and DHCP has been assigning IP addresses to all 580+ devices since the restart.
68
8/11/201610:34 AM8/11/20161:01 PMAll ACD CustomersCampusACD failure - Calls not able to be routed to Depts through call treesunplannedCriticalAll automated call distribution systems on campus are unable to route calls to agents. Our operator is manually transferring people to pilot numbers.NCSKPSNEC staff implemented configuration changes to the PBX in support of VoIP licensing. These changes had the unintended side-effect of disrupting ACD service.Revert changes and restart GNAV server to restart ACD services.
69
8/11/20169:51 AM8/11/20161:33 PMEnterprise App Hosting VEEAM & TSM backup and restore customersEnt App Hosting CustomersExagrid appliance emergency update necessaryunplannedMinorOther backup processes exist, backups typically not taken during business dayINFKGGupdate applied and exagrid back online
70
8/10/201611:31 AM8/14/201611:02 AM100G Border connection to LACampusLoss of external route diversity and capacityunplannedMinorTraffic is using other paths, no link saturation or loss.NCSKPSOptic problem at CENIC facility in Los AngelesCENIC replaced two optics and cleaned fiber jumpers.
71
8/6/201610:00 PM8/7/20163:30 PMTSM Backup ServiceETS INF TSM server not accepting new connectionsunplannedMinorTSM is one of several systems used to protect servers run by INF. Its primary role is to facilitate rapid file-based recoveryINFKGGThe problem was traced to a file system path and permissions issue that occurred during the scheduled space reclamation process. file system permissions updated so that the running process ID had access to log volumes.
72
8/4/20167:25 AM8/4/20168:00 AMBARCBFS/CampusRestart requested by BARC OfficeunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.SEALKCHOIWe have this problem with BARC for some years now that BARC would suddenly stop responding. The cause is unknown. Every time when that happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00am and the other at 5:15pm. Before we put in the scheduled restarts, the worst experience we had was a need to manually restart BARC three times a day. Even with the scheduled restarts, BARC still occasionally misbehaves once every week or two and we need to do a manual restart.
73
8/1/201611:20 AM8/2/201610:30 AMCampus VPNVPN usersVery slow throughput on one of two VPN serversunplannedMajorApproximately 50% of VPN users experienced very slow throughput, reducing utility of VPN service and productivity.NCSKPSSuspect high CPU and/or memory leak, but data not available.Reloaded affected system. Added system internals (disk, cpu, etc.) to monitoring system for alerting and performance trending.
74
7/30/201610:00 PM8/1/20163:00 PMTSM Backup ServiceETS INF TSM server not accepting new connectionsunplannedMinorTSM is one of several systems used to protect servers run by INF. Its primary role is to facilitate rapid file-based recoveryINFKGGTSM DB off-prem backup target became full, caused cascade effect with system not able to accept incoming connections.TSM server not accepting new connections, appears to be capacity issue. Internal DBMS backup process hung. Stuck process cleared and service restored.
75
7/31/20163:27 AM7/31/20164:52 AMNetwork to Materials Research Lab (Bldg 615)MRL, FacilitiesLoss of network connectivity to building 615, Materials Research LabunplannedMajorMajor loss of functionality, however limited use during off-hours.NCSKPSUPS failed causing loss of power to switch. Switch had both power supplies connected to UPS pending availability of commercial outlet.Migrate one switch power supply to direct commercial power.
76
7/28/20163:46 PM7/28/20164:05 PMBARCBFS/CampusRestart requested by BARC OfficeunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.SEALKCHOIWe have this problem with BARC for some years now that BARC would suddenly stop responding. The cause is unknown. Every time when that happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00am and the other at 5:15pm. Before we put in the scheduled restarts, the worst experience we had was a need to manually restart BARC three times a day. Even with the scheduled restarts, BARC still occasionally misbehaves once every week or two and we need to do a manual restart.
77
7/26/20168:15 AM7/26/201612:40 PMFinancial web appsBFS/CampusSSL certificate expired, causing apps to not function.unplannedMajorUsers are not able to login to the apps due to expired certificate.INFCMontecinoA new SSL certificate was created and applied to the EXEM serverExpired EXEM CertificateThe infrastructure team worked on getting a new certificate. After the SSL Certificate was installed we discovered that the Cipher SHA2 was not compatible with the WebSphere version 6 running on the Tsunamiweb server. We ultimately firewalled the communication between Tsunamiweb and EXEM.
78
7/23/201611:30 AM7/23/201612:40 PMBARC, Timekeeping, DWBFS/CampusIntermittent storage switch affecting application performanceunplannedMinorSome applications unavailable for 5-10 minutes, no servers required restartsINFKGGswitch uptime of 840 days, several firmware updates present, switch updated and restarted, switches now in firmware update cycle via NOC. Initial SFP failure led to switch instability
79
7/22/201611:34 PM7/23/20167:45 AMVPNCampusVPN services unavailableunplannedMajorAll VPN access disruptedNCSKPSRoot cause unknown, likely software. VM server locked up, no console, no applicable error logs. HW diags passed.Applied software updates.
80
7/20/201612:19 PM7/20/20162:15 PMCampus TelevideoUCen, Housing, RecCenDisruption of some SD channels to public-area televisions (e.g. UCen Hub)unplannedMinorA subset of SD channels were unavailable. No HD channels affected.NCSKPSThree SD cards locked up in Campus Televideo distribution node.Campus Televideo staff performed remote reset of SD cards in distribution node.
81
7/20/20169:03 AM7/20/201610:15 AMFlexcard Allocation ModuleCampusFlexcard Allocation Users could not loginunplannedMajorUsers are not able to login to the app. This is related to the app not being able to connect to the SQL Server.INFCMontecinoResearching what could have caused the connection drop between SQL Server and WebSphere (network related).Restarting the Application Server-Service (Tsunamiweb) solved the issueSomehow the application server or the SQL Server experienced a network connection drop which in turn affected the application pool that the app uses.
82
7/15/20161:54 PM7/15/20163:33 PMNetwork connection to Gaucho Resource Center (GRC) in IVStudent AffairsHigh traffic loss across radio link to GRC facilityunplannedMinorA few desktops lost connectivityNCSKPSStudent Affairs had a problem with system imaging on another part of the network, resulting in a DHCP request/reply spew. The radios providing the link to GRC were inspecting (processing) this traffic, even though it wasn't relevant to the radios.Consolidate bridged vlans into single bridge to avoid unnecessary DHCP packet processing.
83
7/14/20168:30 PM7/15/20163:03 PMVirtual Center login via SSOInternal ETS Sys and App AdminsA few sys and app admins notified that they were unable to authenticate to VCenter this morning starting around unplannedMinorAbout a dozen Systems and applications admins were unable to connect to virtual center to manage their VMs.INFTDHMS patch KB3161606 was the issue. This patch changed the cipher used by Microsoft to authenticate against our LDAP. This was not coordinated with our LDAP authentication and broke logins to our Vcenter for users authenticating via LDAP.Scott Gilbert has tested the removal of the offending cipher and plans to do the removal in production on Monday morning. For ETS we have removed the MS Patch KB3161606 from automatic installation for this weekend.
84
7/7/201612:08 PM7/7/2016Google MailCampusNo specifics provided, except that Gmail service has already been restored for some users, and we expect a resolution for all users in the near future. Please note this time frame is an estimate and may changeunplannedMinorNo customer issues reportedINFMACN/AN/A
85
7/5/20165:14 PM7/5/20165:24 PMGateway, GMCBFS/CampusReboot due to SEP update. unplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.INFKGGAutomatic update of SEP was not implemented on several SQL servers. In the process of manually staging for 1:15am this evening servers inadvertently restarted.Delay updates until issue with scheduler is corrected, or schedule into Thursday maintenance windows
86
7/5/20164:25 PM7/5/20164:35 PMTOE, TOF, Flexcard, TravelBFS/CampusReboot due to SEP update. unplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.INFKGGAutomatic update of SEP was not implemented on several SQL servers. In the process of manually staging for 1:15am this evening servers inadvertently restarted.Delay updates until issue with scheduler is corrected, or schedule into Thursday maintenance windows
87
7/3/20167:00 PM7/3/201611:00 PMAll Services Hosted on Windows Servers managed by ETS InfrastructureCampusEmergency update to Symantec Endpoint ProtectionplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.INFKGGEmergency update to SEP required to address critical exploits made publicUpdate SEP Mgmt Server, push updates to Windows Clients, reboot servers
88
6/30/20166:47 AM6/30/20168:33 AMGoogle CalendarConnect customers and Calendar onlyIf you're using Outlook, you'll see what was already sync'd, but no new entries. If you're using GWA, calendar won't load.unplannedCriticalProduction or mission critical systems are down and no work around is available.INFKGGhttp://www.google.com/appsstatus#hl=en&v=issue&sid=2&iid=847490285bf1b9e082a699bafb95f53bGoogle does not report RCA or other causes of outage to us. They do provide a history though.
89
6/28/20168:38 AM6/28/20169:00 AMTravelCampusunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.ESICMontecinoThe Travel application could not obtain SQL connections from the poolRestarting the Server (ISC-NET-PROD) solved the issue
90
6/25/20167:58 PM6/25/20168:20 PM
Kronos Electronic Timekeeping - application only (not timeclocks)
Campus - Electronic Timekeeping application usersunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.ESIAWegerTwo employees reported timecard display issues. No ability to timestamp, record hours, or approve.Thought clearing server cache might fix problem but it didn't.
91
6/23/20168:00 AM6/23/201610:57 AMFlexcard Allocation ModuleBFS/CampusunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.ESIQuaziPerhaps network issues. SQL Server to application server could not communicate.
92
6/23/201611:00 PM6/24/20168:30 AMMainframe SMTP MailerBFSSMTP mailer process did not restart, delaying email delivery off mainframeunplannedMinorNo data lost, these are mostly job status reportingINFGrierSMTP process restarted manually. Mainframe is delivery-only, it does not support incoming email.SMTP restarted manually, investigating methods to notify outside of email notifications.
93
6/21/20165:53 PM6/21/20166:05 PMBARCBFS/CampusRestart requested by BARC OfficeunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.ESIKGGWe have this problem with BARC for some years now that BARC would suddenly stop responding. The cause is unknown. Every time when that happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00am and the other at 5:15pm. Before we put in the scheduled restarts, the worst experience we had was a need to manually restart BARC three times a day. Even with the scheduled restarts, BARC still occasionally misbehaves once every week or two and we need to do a manual restart.
94
6/20/20167:54 AM6/20/20165:12 PMFlexCardBFS/CampusIncorrect FlexCard exams notices sent via emailunplannedMinorNotices emailed in error, actual flexcard system not affectedINFKGGFlexcard and other applications ported from "tempest" to "fssqlprod" in late May. SQL server on tempest stopped at that time. tempest sql server restarted - exactly why is being investigated - and incorrect flexcard exam notices emailed out. Problem identified early afternoon and tempest sql server disabled. tempest sql server disabled, VM will be archived. After Action review will identify and remedy internal communications lapse on request to archive machine.
95
6/18/20162:13 PM6/21/2016OngoingGoToMyPCBFS/CampusPassword attack on GoToMyPCunplannedCriticalProduction or mission critical systems are down and no work around is available.EUCMACPassword re-use attack, where attackers used usernames and passwords leaked from other websites to access the accounts of GoToMyPC users.Suspended all BFS accounts and requesting users to go reset password before re-enabling accounts.
96
6/16/20166:17 PM6/17/20168:25 AMhttps://it.ucsb.edu/CampusWeb site unavailableunplannedMinorService is informational, no application interruption.INFKGGapache not responding, appears to be problem with windows update - malicious sw removal tool. Applied update, restarted server, verified functional it.ucsb.edu siteapplied registry edit to disable MRT updating. The vm already has SEP and CYlance Protect so MRT is unnecessary.
97
6/16/20162:47 PM6/16/20163:00 PMEspresso loginBFS/CampusThe Login page was unavailableunplannedMinorESIQuaziAn automated service update disabled internet connectivity.
98
6/13/201611:25 AM6/13/201611:35 AMBARCBFS/CampusRestart requested by BARC OfficeunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.ESIKGGWe have this problem with BARC for some years now that BARC would suddenly stop responding. The cause is unknown. Every time when that happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00am and the other at 5:15pm. Before we put in the scheduled restarts, the worst experience we had was a need to manually restart BARC three times a day. Even with the scheduled restarts, BARC still occasionally misbehaves once every week or two and we need to do a manual restart.
99
6/8/201610:50 AM6/8/201610:53 AMBARCBFS/CampusRestart requested by BARC OfficeunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.ESIKCHOIWe have this problem with BARC for some years now that BARC would suddenly stop responding. The cause is unknown. Every time when that happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00am and the other at 5:15pm. Before we put in the scheduled restarts, the worst experience we had was a need to manually restart BARC three times a day. Even with the scheduled restarts, BARC still occasionally misbehaves once every week or two and we need to do a manual restart.
100
6/8/20167:46 AM6/8/20169:00 AMBARCBFS/CampusRestart requested by BARC OfficeunplannedMajorMajor functionality is impacted, but general operations is still available. Work around is available.ESIKCHOIWe have this problem with BARC for some years now that BARC would suddenly stop responding. The cause is unknown. Every time when that happens, we have to reboot the server. We currently have two scheduled daily restarts, one at 5:00am and the other at 5:15pm. Before we put in the scheduled restarts, the worst experience we had was a need to manually restart BARC three times a day. Even with the scheduled restarts, BARC still occasionally misbehaves once every week or two and we need to do a manual restart.