We're aware of an issue impacting some servers with IDs beginning with srv3xxx, and we are investigating now.
If your server ID begins with srv3xxx and you are seeing problems, they are due to this issue.
Original information from the CoreSite LA2 team: Description: CoreSite field operations has confirmed that we experienced a power quality event (PQE) at the data center. Due to the utility voltage inconsistency, our generators did turn on. Our local Data Center Operations team is investigating the cause further and will provide additional updates as we receive them. You may direct any questions or concerns to the below department:
Latest update from the CoreSite LA2 team: The site has returned to the normal utility source with no Data Center Operations intervention. The DCO team is continuing site checks.
We will update this posting as we have more information.
Update: One of our technicians has been deployed and is currently en route to address the situation.
Update from CoreSite LA2 below. At this time, we too are going to close this incident.
Update: Data Center Operations has confirmed all systems remain stable in their current configuration after our last PQE. All UPS systems are on inverter, with battery backup available should it be required. UPS 1B is still charging batteries and will require at least another 12 hours to reach full charge. All building load is on utility power with the exception of Mezzanine 1 (CR185), which remains on the emergency source generators awaiting the arrival of our support vendor to troubleshoot our transfer device; the estimated time of arrival is 4:00 PM Pacific. We will be closing this incident communication at this time and focusing our communications on those clients located in the Mezzanine who remain supported by our generators.
Update from CoreSite LA2:
Update: We have experienced another PQE at the facility. The site is performing rounds to confirm all systems are operating properly. UPS 1B is back on inverter and all PDUs are restored.
Update from CoreSite LA2:
Update: Data Center Operations needs to restore a PDU breaker downstream of UPS 1B. In order to do this, we will need to place UPS 1B back into bypass. Once the PDU is restored, we will place UPS 1B back on inverter; we expect this operation to take about 20 minutes.
Update from CoreSite LA2:
Update: Data Center Operations has placed UPS 1B back on inverter; all UPS systems are presently supported by battery backup. Mezz1 remains on the emergency source generators, with a vendor dispatched and arriving later today to help troubleshoot the transfer device for this system. All other spaces are supported by the normal source utility.
Update from CoreSite LA2:
Update: Data Center Operations is reporting that a subsequent PQE has occurred at the facility. The team is currently verifying that all systems are functioning properly.
Update from CoreSite LA2:
Update: Data Center Operations confirms that only Mezz1 remains supported by the emergency source generators; all other spaces are supported by the normal source utility. Our UPS vendor has arrived onsite and is working to put UPS 1B back on inverter; all other UPS systems are operating normally. We have also dispatched an additional vendor to help troubleshoot our transfer system.
Update from CoreSite LA2:
Update: Data Center Operations is confirming that both Mezz1 and Mezz2 are being supported by the emergency source generators; all other spaces are supported by the normal source utility. We have brought the UPS 1B system modules back online and are charging batteries while the system remains in bypass; all other UPS systems are operating normally.
Update from CoreSite LA2:
Update: Data Center Operations has confirmed all systems are online downstream of UPS 1B; however, this load is on bypass without battery backup availability.
This means that all single-PSU servers should likewise be on now. If you find your server inaccessible, please try powering it on in SynergyCP if it is not already powered on (a single power-on attempt is enough; repeated attempts will not help if the first did not work), and check the KVM console to see whether the server has booted into the operating system before opening a support ticket.
We'll post another update when UPS 1B is re-engaged.
Update from CoreSite LA2 below. This means single-PSU servers should be coming online shortly.
Update: Data Center Operations has confirmed all site load is back on normal source utility power. We have also successfully taken the UPS 1B system into maintenance bypass and are working to restore downstream breakers one at a time to reduce inrush current.
Update from CoreSite LA2:
Update: The Data Center Operations team will need to back out of generator power and will place the site back on the utility source. Our Data Center Operations team has found an issue upstream with the generator transfer switch. The site will remain as it was prior while the team works on placing UPS 1B into maintenance bypass to troubleshoot the issues.
This incident is mostly resolved. CoreSite continues to work on A+B power distribution issues. This means that some servers may be offline if they have a single PSU. These can include:
- Some E3-1270 servers that do not have "(mc)" or "(vlp)" in their SynergyCP CPU name
- Some E3-1650v3/v4 servers
- Some Ryzen 5900X servers
- Some colocation equipment with a single PSU, or with dual PSUs where one PSU is dead
These servers' return to service depends on CoreSite restoring full A+B feeds. We will update this status posting when we have been notified of power stability.
Update from CoreSite LA2: Site is on generator power while our team continues to troubleshoot the issue taking place at the moment.
CoreSite Update: Please be advised that our DCO team has found a partial outage on the PDUs that supply your power. Our team will be working on placing the unit into maintenance bypass and getting a vendor out to help with troubleshooting efforts. We will provide further updates as we receive them.
Dedicated.com Translation: The PDUs mentioned are not our cabinet-specific PDUs connected to our equipment. Instead, these are the facility's PDUs that supply power to our cabinets, which explains why some cabinets have not fully recovered.
Actionable items: We have identified that some of our cabinets are operating on a single surviving power leg. Our technician is already onsite reviewing the situation and, if necessary, relocating single PSU devices (primarily customer colocation equipment) to the surviving power legs.
Unfortunately, the power cutovers from CoreSite were not smooth and resulted in some outages. Our network devices are confirmed to be online, and the remaining outages are limited to single PSU devices on the unrecovered feeds. We are actively working to move these devices to the surviving leg where possible.
We will provide further updates as the situation evolves. Thank you for your patience and understanding.
CoreSite Update: The Data Center Operations team found power feed PDUs in LL partially offline. The DCO team is currently reaching out to a vendor for a dispatch to assist in troubleshooting. We will provide the vendor dispatch ETA once received.
Dedicated.com Translation: These are not to be confused with our own cabinet PDUs connected to equipment in our cabinets. The PDUs mentioned feed our cabinets, which explains why some of our cabinets have not fully recovered.
Due to https://status.dedicated.com/incidents/44, SynergyCP is currently inaccessible. We recommend subscribing to status updates at the bottom of the status page to receive updates as we post them.
If you need an operation performed for a mission-critical server in a location that is not New York, please open a ticket at https://my.dedicated.com/clientarea.php with your justification, and we will engage alternate means of control for your server.
We are investigating connectivity issues for our New York location now. We will update this posting as we have more information.
The switch stack for series srv54xx servers has been fixed, which makes New York fully operational at this point.
If you're noticing an issue with your server or colo, please reach out with details and we can investigate further.
We have been granted access to our pod. A reminder to please check your server and submit a ticket if you need assistance after checking it. Please see the previous posting for specific instructions regarding checking your server and ticketing: https://status.dedicated.com/incidents/44#update-122
We are ready to work per-device colo and rented-server issues. There are thousands of rented servers and colo devices in the New York deployment, so it is not possible for us to check each of them individually.
Please refrain from opening "my server is down" tickets without further information. Please work to check your server. If your server is unreachable, please open a ticket showing your server's KVM console and what you're currently seeing on the console, along with investigation steps you've taken already.
Tickets opened as simply "my server is down" with no further information will be deprioritized against tickets showing that you've checked your server and it's still unreachable.
Reminder: servers in series srv54xx are still unreachable due to the switch issue. Remotely, we've determined that the switch has corrupted both its primary and backup OS images and will need to be restored once our team is able to gain access to the building, which is still in progress. Please don't open tickets for servers in series srv54xx at this time; we are aware and will post an update for them soon.
Please refrain from opening "my server is down" tickets. We've reached out to you if you're an enterprise customer to confirm that your stack is up. We'll update this posting when we're ready to work rented server and single-server colo down.
We are still awaiting building access. Pod cams show no fire damage.
Most things have come online by themselves after our pod was energized. We have reached out to our enterprise customers to have them confirm any power or network issues. We are working to confirm rented servers are online.
Currently known issues:
- Servers in series srv54xx are still offline, as the switch did not come online. We are investigating.
- The server control system remains inaccessible to end users while we confirm the scope of what may still have issues.
Our pod has been energized. Things are coming online. We're still working on site access. We'll post more info as this progresses.
DC Update:
Our onsite team is currently bringing our UPS Systems online. We now have UPS-4 and UPS-R online. While bringing up UPS-1, we ran into an issue and we are unable to bring it online. We have engaged with our vendor and are finding a workaround to deliver power downstream to customer cabinets.
Due to the issue with UPS-1, we will not be automatically powering up all customer cabinets. We will be reaching out to you to let you know if you are part of the group of customers on unprotected/non-UPS backed utility power so that you can make the decision whether to energize your cabinets at that time.
We are currently at 50% completion toward bringing the site back online, and the revised ETA for bringing up the critical infrastructure systems is approximately 3 hours. The current time frame for when clients will be able to come back onsite is approximately 10:30 PM EDT.
We are currently sourcing materials to bring our fire system fully online and do not have an ETA for completion. Because of this and fire marshal compliance, we will only be allowed to offer supervised, escorted customer access once we finish bringing up the critical infrastructure systems. We will have additional personnel onsite to assist us with this escort policy.
We are not part of UPS-1.
DC Update:
Our onsite team is currently bringing our UPS systems online. We have our UPS vendor onsite assisting us with this. We have brought UPS-4 online. We resolved our issues with UPS-R, have brought it online, and are currently charging its associated battery system. We will now begin work on bringing UPS-1 online. After bringing the UPS systems online, we will fully transfer load to our downstream power distribution, which will enable us to start powering up individual customer cabinets.
We are currently at 40% completion toward bringing the site back online, and the revised ETA for bringing up the critical infrastructure systems is approximately 4 hours. We are still planning for an evening time frame for when clients will be able to come back on site.
We are currently sourcing materials to bring our fire system fully online and do not have an ETA for completion. Because of this and fire marshal compliance, we will only be allowed to offer supervised, escorted customer access once we finish bringing up the critical infrastructure systems.
DC Update:
Our onsite team is currently bringing our UPS systems online. We have our UPS vendor onsite assisting us with this. We have brought UPS-4 online and are currently charging its associated battery system. While bringing UPS-R online, we ran into a minor issue that we are currently investigating.
We are currently at 35% completion toward bringing the site back online, and the revised ETA for bringing up the critical infrastructure systems is approximately 5 hours. We are still planning for an evening time frame for when clients will be able to come back on site.
In parallel with re-energizing our power systems, we have been working on our fire system as well. We are currently sourcing materials to bring our fire system fully online and do not have an ETA for completion. Because of this and fire marshal compliance, we will only be allowed to offer supervised, escorted customer access once we finish bringing up the critical infrastructure systems. We are currently sourcing additional personnel to assist us with this escort policy.
DC update:
Our onsite team is currently bringing our UPS systems online. We have our UPS vendor onsite assisting with this as we bring up UPS-4, followed by UPS-R.
As these systems are brought online, we will concurrently work on bringing carriers up.
We are currently at 30% completion toward bringing the site back online, and the revised ETA for bringing up the critical infrastructure systems is approximately 6 hours. We are still planning for an evening time frame for when clients will be able to come back on site.
A reminder that we do not use the building's carrier blend, so our carrier turn-up will be different.
DC update:
Our onsite team has energized the primary electrical equipment that powers the site, enabling us to bring our mechanical plant online. We are currently cooling the facility.
As we monitor for stability, we are focused on bringing up our electrical systems. In starting this process, we have identified an issue with powering up our fire panel, as well as with power systems that were powered by UPS-3. While this will cause us a delay, we are working with our vendors on remediation.
We are currently at 25% completion toward bringing the site back online, and the revised ETA for bringing up the critical infrastructure systems is approximately 7 hours. We are still planning for an evening time frame for when clients will be able to come back on site. We will send out additional information regarding access to the facility and remote hands assistance, and we will notify you once client access to the facility is permitted.
When our team is permitted on-site, our plan is as follows: Our COO will be on site waiting to enter the building as soon as possible, and will start powering everything up, beginning with the network edge/core, followed by racks in order of rack number.
Each rack will be worked on to ensure it's up and running before we move on to the next.
If you have a full rack with access, please be advised that access will not be permitted until everything is online. We appreciate that you may want your services brought online ahead of others, but we are unable to permit such prioritization, and interference will only delay our ability to perform the work.
We fully understand the gravity of the situation, and everyone will be brought up with maximum urgency.
Our team will remain on site for a period of time after everything is up to ensure that both the facility and our own services remain stable.
We will continue to post updates as we have them.
DC's hourly update:
We have completed the full site inspection with the fire marshal and the electrical inspector and utility power has been restored to the site.
We are now working to restore critical systems and our onsite team has energized the primary electrical equipment that powers the site. Concurrently, we are beginning work to bring the mechanical plant online. Additional engineers from other facilities are on site this morning to expedite site turn up.
The ETA for bringing up the critical infrastructure systems is approximately 5 hours.
We are planning for a late afternoon/early evening time frame when clients will be able to come back on site.
Datacenter update:
Our site inspection this morning went well. We have been granted authorization to restore utility power to the site and are currently working on re-energizing the facility. Our onsite team is working with the fire marshal and electrical inspectors to ensure electrical system safety as we prepare to bring utility power back.
Once that is completed, we will work towards bringing up our critical infrastructure systems. This will take approximately 5 hours.
While we are working on that, we will also be working on our fire/life safety systems as we need to replace some smoke detectors and have a full inspection of the fire system prior to allowing customers to enter the facility.
We will be sending out hourly updates as we make progress on bringing the facility back online.
Preliminary update from our CSM:
I heard the preliminary inspection is good and we are taking steps to energize the property now.
I’m waiting for the official update from DC Ops. More to come.
We'll post more info as soon as we have it, in addition to our power-up plan.
Below is the latest update:
The EWR Secaucus data center remains powered down at this time per the fire marshal. We are continuing with our cleanup efforts into the evening and working overnight as we make progress towards our 9AM EDT meeting time with the fire marshal and electrical inspectors in order to reinstate power at the site.
Once we receive approval and utility is restored, we will turn up critical systems. This will take approximately 5 hours. After the critical systems are restored, we will be turning up the carriers and then will start to turn the servers back on.
The fire marshal has requested replacement of the smoke detectors in the affected area as well as a full site inspection of the fire life safety system prior to allowing customers to enter the facility. Assuming that all goes as planned, the earliest that clients will be allowed back into the site to work on their equipment would be late in the day Wednesday.
We will notify you when client access to the facility has been approved. Please open a separate ticket per standard process to request additional badge access if needed prior to arrival.
This will be the last update of the day. We will provide further updates tomorrow after our site inspection.
Note that our process to turn up carriers and servers will be different, as we do not use the datacenter's carrier blend. We will share our plan when we have a firmer sense of when we can expect to be allowed into the building to perform turn-up on our equipment.
We have received the following disappointing response from the datacenter:
The EWR Secaucus data center remains powered down at this time per the fire marshal.
We have just finished the meeting with the fire marshal, electrical inspectors, and our onsite management. We have made great progress cleaning, and after reviewing it with the fire marshal, they have asked us to clean additional spaces and to replace some components of the fire system. They have set a time of 9 AM EDT Wednesday to come back and review these items. We are working with these vendors to comply completely with the new requests and are bringing additional cleaning personnel onsite to meet the fire marshal's deadline.
In preparation for being able to allow clients onsite, the fire marshal has stated that we need to perform a full test of the fire/life safety systems which will be done after utility power has been restored and fire system components replaced. We have these vendors standing by for this work tomorrow.
Assuming that all goes as planned, the earliest that clients will be allowed back into the site to power up their servers would be late in the day Wednesday.
We are working to see what alternatives we have, if any. Again, please continue to engage your business continuity plans.
As we have not heard back about the results of the fire marshal/electric utility/DC ops meeting, we have pinged for an update.
Datacenter update:
The EWR Secaucus data center remains powered down at this time per the fire marshal.
Site management, the fire marshal, and electrical contractors are currently meeting to review the process of the cleaning effort to get approval from the fire marshal to re-energize the site.
Access update for our team from the datacenter:
The VP of DC Ops will be sending out instructions for re-entry to the site. If all goes as planned, it will be around 6:00-7:00 PM. We need to re-energize the critical infrastructure at the site and get it cooled down prior to giving customer access. This will take 4 to 5 hours, assuming the fire marshal gives the all clear.
Mid-day update from the datacenter:
The EWR Secaucus data center remains powered down at this time per the fire marshal. We continue to clean and ready the site for final approval by the fire marshal in order to re-energize the facility's critical equipment. Site management, the fire marshal, and electrical contractors will be meeting at 2 PM EDT in an attempt to receive approval from the fire marshal to re-energize the site. We do not foresee any issues that would result in not receiving such approval.
Re-energizing critical equipment will take 4-5 hours. After this process, we will be energizing customer circuits and powering on all customer equipment. We will provide updates as to when customers will be allowed in the facility once approved by the fire marshal.
Addressing "is our data safe?" concerns: we haven't been able to actively make that determination yet.
Our understanding of the scope of the issue is that the fire was limited to a UPS in an electrical room, which was extinguished almost immediately by on-site fire suppressant, and did not extend to the data halls. This means that data should indeed be safe; however, our team has not been permitted on site yet, and power remains off to the building.
Our expectation is that when power is restored, we should be able to power on all routers, switches, and servers normally, and any data loss would be limited to anything resulting from a hard power off situation, but again, this isn't confirmable yet.
In the event that you need to restore data from off-site backups, we can reset your monthly bandwidth cap to ensure that you can pull down any data you need from your off-site backups.
We will continue to post updates as we have them.
Update from the datacenter below. Their message did not specify a next update time, so I've asked when we can expect to be updated.
Power remains off at our data center in Secaucus/EWR1 per the local fire marshal.
Current status update from DC Ops:
Our remediation vendor and our team have worked through the night to clean the UPS units at the request of the fire marshal. They have made significant progress, and we hope to have the cleaning completed by mid-day, at which time we will engage the fire marshal to review the site. Following their review, we hope to get a sign-off from them so that we can start the re-energizing process. The re-energizing process can take 4-5 hours, as we need to turn up the critical infrastructure prior to any servers.
Current status from the datacenter below. We've asked for that 8 AM EDT update, as the time frame has come and gone.
Power remains off at our data center in Secaucus/EWR1 per the local fire marshal.
After reviewing the site, the fire marshal is requiring that we extensively clean the UPS devices and rooms before they will allow us to re-energize the site. We have a vendor at the site currently who will be performing that cleanup. We will provide an update at 8:00 AM EDT unless something significant changes overnight.
We will continue to provide updates as we receive them.
Statement from the datacenter itself:
Power remains off at our data center in Secaucus/EWR1 per the local fire marshal.
We had an electrical failure with one of our redundant UPS units, which started to smoke and then had a small fire in the UPS room. The fire department was dispatched and the fire was extinguished quickly. The fire department subsequently cut power to the entire data center and disabled our generators while they and the utility verify the electrical system. We have been working with both the fire department and the utility to expedite this process.
We are currently waiting on the fire marshal and local utility to reenergize the site. We are completely dependent upon their inspection and approval. We are hoping to get an update that we can share in the next hour.
At the current time, the fire department is controlling access to the building and we will not be able to let customers in.
We've received the update: an isolated fire in a UPS in an electrical room was detected and put out by fire suppression. The local fire department arrived on the scene and, per NEC guidelines and likely local laws and general best practices for firefighters, cut power to the building. This caused the down -> up -> down cycle noted earlier today.
Currently, datacenter electricians are on site to perform repair work on the UPS, but they are awaiting permission from the fire department to enter the building.
Once the electrical work is complete, power will be applied to HVAC to subcool the facility, which will take an estimated 3-4 hours. At that point, power will be restored to the data halls, which will bring our network and servers back online.
The datacenter manager gave a best-case ETA of tomorrow morning, July 11th, for power to be restored to data halls. Again, please engage your business continuity plans as this does remain an open-ended outage outside of our control.
We will post more updates as we have them.
We expect to be updated within the hour. We will pass on information we receive.
This situation remains in progress. We will post updates as we have them.
We've been informed that an electrical room experienced a fire, which was put out by fire retardant, and that the datacenter is in emergency power-off status at the requirement of on-site firefighters. Our COO is on site, but we are presently unable to access the building. We are working to learn more. We do not have further information at this time.
As this outage does not have a clearly defined resolution and is outside of our control at this time, please execute your business continuity plans accordingly.
We will continue to post updates here as we have them.
The datacenter is experiencing a fire and is in emergency power off. We don't have further information at this time. We'll update this posting as we have more information.
We are seeing this potentially recur. We are continuing to investigate, and we continue to engage the datacenter to understand the situation.
All servers should be back online with the exception of NY servers starting with ID srv54xx. We are working on this server series now. We will post more information as we have it.
We seem to have experienced some kind of DC-wide power event. We are still investigating. We will post updates here as we have them.