ATLAS_result locked by file_upload_handler PID=-1



Message boards : Number crunching : ATLAS_result locked by file_upload_handler PID=-1

Yeti
Joined: 20 Jul 14
Posts: 699
Credit: 22,597,832
RAC: 0
    
Message 5991 - Posted: 16 Jan 2017, 16:51:18 UTC

I had to reboot my PC, and now I cannot upload one result:

16/01/2017 17:16:45 | ATLAS@home | Started upload of xZ1KDm4JHnpn7jp7oou28CBqABFKDmABFKDm27FKDmQxHKDmP7Tfam_0_r160647471_ATLAS_result
16/01/2017 17:16:47 | ATLAS@home | [error] Error reported by file upload server: [xZ1KDm4JHnpn7jp7oou28CBqABFKDmABFKDm27FKDmQxHKDmP7Tfam_0_r160647471_ATLAS_result] locked by file_upload_handler PID=-1
16/01/2017 17:16:47 | ATLAS@home | Temporarily failed upload of xZ1KDm4JHnpn7jp7oou28CBqABFKDmABFKDm27FKDmQxHKDmP7Tfam_0_r160647471_ATLAS_result: transient upload error
16/01/2017 17:16:47 | ATLAS@home | Backing off 00:43:16 on upload of xZ1KDm4JHnpn7jp7oou28CBqABFKDmABFKDm27FKDmQxHKDmP7Tfam_0_r160647471_ATLAS_result
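
To see at a glance which result files are stuck, event-log lines like the ones above can be scanned with a few lines of Python. This is only a minimal sketch, not an official BOINC tool: the function name `summarize` and the sample file name are made up for illustration, and the regular expressions assume the exact message wording quoted in this thread.

```python
import re

# Patterns for the two client messages quoted in this thread.
# Assumption: the wording matches the event-log lines shown above.
LOCK_RE = re.compile(
    r"Error reported by file upload server: \[(?P<name>[^\]]+)\] "
    r"locked by file_upload_handler PID=(?P<pid>-?\d+)"
)
BACKOFF_RE = re.compile(
    r"Backing off (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) on upload of (?P<name>\S+)"
)

def summarize(lines):
    """Return {file_name: {"pid": int, "backoff_s": int or None}} for locked uploads."""
    locked = {}
    for line in lines:
        m = LOCK_RE.search(line)
        if m:
            locked.setdefault(m["name"], {"pid": int(m["pid"]), "backoff_s": None})
            continue
        m = BACKOFF_RE.search(line)
        if m and m["name"] in locked:
            seconds = int(m["h"]) * 3600 + int(m["m"]) * 60 + int(m["s"])
            locked[m["name"]]["backoff_s"] = seconds
    return locked

# Hypothetical sample with a shortened, made-up file name.
SAMPLE = [
    "16/01/2017 17:16:47 | ATLAS@home | [error] Error reported by file upload server: "
    "[example_0_r1_ATLAS_result] locked by file_upload_handler PID=-1",
    "16/01/2017 17:16:47 | ATLAS@home | Backing off 00:43:16 on upload of "
    "example_0_r1_ATLAS_result",
]

if __name__ == "__main__":
    for name, info in summarize(SAMPLE).items():
        print(name, info)
```

Feeding it the saved event log of each machine gives one entry per affected file, which makes it easier to compare hosts.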

Yeti
Joined: 20 Jul 14
Posts: 699
Credit: 22,597,832
RAC: 0
    
Message 5992 - Posted: 16 Jan 2017, 16:56:18 UTC

Hm, I found more machines that cannot upload results, so it can't be the reboot. Something seems to be wrong with the upload server.

HerveUAE
Joined: 18 Dec 16
Posts: 44
Credit: 509,829
RAC: 0
    
Message 5993 - Posted: 16 Jan 2017, 16:58:49 UTC - in response to Message 5991.

I have the exact same error on one of my PCs, and on the other I have this error:
16/01/2017 20:53:48 | ATLAS@home | Temporarily failed upload of rRJODmh65mpnDDn7oo6G73TpABFKDmABFKDmXBLKDmABFKDm0HiLkm_0_r817322314_ATLAS_result: transient HTTP error

Both errors prevent the upload from completing; BOINC just keeps retrying in an endless loop.

Any clues?

HerveUAE
Joined: 18 Dec 16
Posts: 44
Credit: 509,829
RAC: 0
    
Message 5994 - Posted: 16 Jan 2017, 17:00:14 UTC - in response to Message 5993.

And on one of the computers it was definitely not following a reboot.

HerveUAE
Joined: 18 Dec 16
Posts: 44
Credit: 509,829
RAC: 0
    
Message 5995 - Posted: 16 Jan 2017, 17:10:58 UTC - in response to Message 5994.

Now both report the same error: locked by file_upload_handler PID=-1

Hona
Joined: 9 Aug 14
Posts: 1
Credit: 134,945
RAC: 0
    
Message 5996 - Posted: 16 Jan 2017, 18:48:48 UTC - in response to Message 5991.

The same problem here.
I paused the running result until this is fixed.

Claus Varming Lund
Joined: 8 Mar 15
Posts: 23
Credit: 13,104,473
RAC: 0
    
Message 5998 - Posted: 16 Jan 2017, 21:20:58 UTC

Same upload problem here, so I guess it's the upload server.
Moving to LHCb temporarily if there is no news in the next 3h :-)

Claus Varming Lund
Joined: 8 Mar 15
Posts: 23
Credit: 13,104,473
RAC: 0
    
Message 5999 - Posted: 16 Jan 2017, 21:46:20 UTC - in response to Message 5998.

I have uploaded 3 or 4 WUs that were completed after the 3 WUs that I can't upload, and everything I complete now seems to upload fine.
Any ideas why some won't upload?

47an
Joined: 14 Jan 15
Posts: 24
Credit: 10,732,116
RAC: 0
    
Message 6000 - Posted: 16 Jan 2017, 23:46:19 UTC
Last modified: 16 Jan 2017, 23:55:48 UTC

Tue 17 Jan 2017 12:36:35 AM CET | ATLAS@home | Started upload of zGVKDmRpjmpnDDn7oo6G73TpABFKDmABFKDm8DGKDmABFKDmFSGWAo_0_r975819286_ATLAS_result
Tue 17 Jan 2017 12:36:35 AM CET | ATLAS@home | Started upload of lRBMDmNDMnpn7jp7oou28CBqABFKDmABFKDm27FKDm43HKDmPgaBJo_0_r1183985507_ATLAS_result
Tue 17 Jan 2017 12:36:37 AM CET | ATLAS@home | [error] Error reported by file upload server: [zGVKDmRpjmpnDDn7oo6G73TpABFKDmABFKDm8DGKDmABFKDmFSGWAo_0_r975819286_ATLAS_result] locked by file_upload_handler PID=-1
Tue 17 Jan 2017 12:36:37 AM CET | ATLAS@home | [error] Error reported by file upload server: [lRBMDmNDMnpn7jp7oou28CBqABFKDmABFKDm27FKDm43HKDmPgaBJo_0_r1183985507_ATLAS_result] locked by file_upload_handler PID=-1

So it looks as if there is no ID for these tasks in the database, so their uploads are rejected.
I have aborted 5 tasks, but some hosts keep getting new tasks each day.

HerveUAE
Joined: 18 Dec 16
Posts: 44
Credit: 509,829
RAC: 0
    
Message 6004 - Posted: 17 Jan 2017, 2:52:30 UTC - in response to Message 6000.

Could this be, by any chance, a consequence of the ongoing challenge? I have never seen this since I joined BOINC (just a month ago, though).

Fuzzy Duck
Joined: 3 Dec 15
Posts: 33
Credit: 5,074,231
RAC: 0
    
Message 6007 - Posted: 17 Jan 2017, 8:09:40 UTC - in response to Message 5991.
Last modified: 17 Jan 2017, 8:10:56 UTC

I am also having problems uploading just specific WUs on one machine.

From event log:
17/01/2017 08:07:37 | ATLAS@home | [error] Error reported by file upload server: [0aUKDmy3hmpn7jp7oou28CBqABFKDmABFKDm27FKDm27GKDmTYBoNo_2_r420929842_ATLAS_result] locked by file_upload_handler PID=-1
17/01/2017 08:07:37 | ATLAS@home | Temporarily failed upload of 0aUKDmy3hmpn7jp7oou28CBqABFKDmABFKDm27FKDm27GKDmTYBoNo_2_r420929842_ATLAS_result: transient upload error

Michael H.W. Weber
Joined: 10 Jan 15
Posts: 108
Credit: 1,552,848
RAC: 0
    
Message 6008 - Posted: 17 Jan 2017, 9:03:16 UTC
Last modified: 17 Jan 2017, 9:04:22 UTC

Same issue over here:

17.01.2017 07:56:48 | ATLAS@home | Started upload of G9SLDm9HOnpnDDn7oo6G73TpABFKDmABFKDm57FKDmABFKDmd8uBJo_0_r611626053_ATLAS_result
17.01.2017 07:56:48 | ATLAS@home | Started upload of KNGLDm9JPnpnDDn7oo6G73TpABFKDmABFKDmiZFKDmABFKDm3A23Fo_0_r1133623716_ATLAS_result
17.01.2017 07:56:49 | ATLAS@home | [error] Error reported by file upload server: [G9SLDm9HOnpnDDn7oo6G73TpABFKDmABFKDm57FKDmABFKDmd8uBJo_0_r611626053_ATLAS_result] locked by file_upload_handler PID=-1
17.01.2017 07:56:49 | ATLAS@home | [error] Error reported by file upload server: [KNGLDm9JPnpnDDn7oo6G73TpABFKDmABFKDmiZFKDmABFKDm3A23Fo_0_r1133623716_ATLAS_result] locked by file_upload_handler PID=-1
17.01.2017 07:56:49 | ATLAS@home | Temporarily failed upload of G9SLDm9HOnpnDDn7oo6G73TpABFKDmABFKDm57FKDmABFKDmd8uBJo_0_r611626053_ATLAS_result: transient upload error
17.01.2017 07:56:49 | ATLAS@home | Backing off 05:08:27 on upload of G9SLDm9HOnpnDDn7oo6G73TpABFKDmABFKDm57FKDmABFKDmd8uBJo_0_r611626053_ATLAS_result
17.01.2017 07:56:49 | ATLAS@home | Temporarily failed upload of KNGLDm9JPnpnDDn7oo6G73TpABFKDmABFKDmiZFKDmABFKDm3A23Fo_0_r1133623716_ATLAS_result: transient upload error
17.01.2017 07:56:49 | ATLAS@home | Backing off 04:48:55 on upload of KNGLDm9JPnpnDDn7oo6G73TpABFKDmABFKDmiZFKDmABFKDm3A23Fo_0_r1133623716_ATLAS_result

I tried repeatedly to manually upload these queued task results but they just won't upload while other BOINC projects run fine.

Michael.
____________
President of Rechenkraft.net

Michael H.W. Weber
Joined: 10 Jan 15
Posts: 108
Credit: 1,552,848
RAC: 0
    
Message 6020 - Posted: 18 Jan 2017, 19:46:46 UTC

The files are still queued for upload. What is going on?

Michael.
____________
President of Rechenkraft.net

PHILIPPE
Joined: 24 Jul 16
Posts: 84
Credit: 53,413
RAC: 0
    
Message 6021 - Posted: 18 Jan 2017, 20:47:31 UTC - in response to Message 6020.

Searching the net, I found the same issue reported in a forum, but for the SETI project in 2007.

I hope it will be solved with less pain.

Are only the multicore tasks concerned?

I run single-core tasks and have had no problem so far...

HerveUAE
Joined: 18 Dec 16
Posts: 44
Credit: 509,829
RAC: 0
    
Message 6025 - Posted: 19 Jan 2017, 2:12:51 UTC - in response to Message 6021.

Are only the multicore tasks concerned?

In my case the problem occurred with both types.

47an
Joined: 14 Jan 15
Posts: 24
Credit: 10,732,116
RAC: 0
    
Message 6040 - Posted: 21 Jan 2017, 23:39:42 UTC
Last modified: 21 Jan 2017, 23:41:23 UTC

Project admins: We still have pending upload transfers, and they take up space; I abort them to free space for other tasks. If there is no solution to the PID=-1 error, could you send a cancel for these tasks?

As I understand it, these tasks have no ID linking them to a WU and would therefore never have an ID in the database, so they are ghost tasks in users' BOINC Managers. These VM tasks could prevent a host from doing other work. Since BOINC has a 100 GB storage limit, these VM tasks fill up a host's storage. We need to clear out our slots folder.
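
For anyone who wants to check how much space each slot is actually taking before aborting anything, here is a rough sketch. It makes assumptions: `slot_usage` and `dir_size_bytes` are illustrative names, and the path you pass in must be the `slots` folder inside your own BOINC data directory, whose location differs between installations.

```python
import os

def dir_size_bytes(root):
    """Sum the sizes of all regular files below root (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return total

def slot_usage(slots_dir):
    """Map each immediate subdirectory of slots_dir to its size in bytes."""
    usage = {}
    with os.scandir(slots_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                usage[entry.name] = dir_size_bytes(entry.path)
    return usage

if __name__ == "__main__":
    import sys
    # Usage: python slot_usage.py /path/to/your/BOINC/slots
    for slot, size in sorted(slot_usage(sys.argv[1]).items()):
        print(f"slot {slot}: {size / 1e9:.2f} GB")
```

This only reports sizes; it does not delete anything, so it is safe to run while tasks are active.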

Michael H.W. Weber
Joined: 10 Jan 15
Posts: 108
Credit: 1,552,848
RAC: 0
    
Message 6051 - Posted: 23 Jan 2017, 8:32:47 UTC

To date no answer.
Several tasks are still in the upload queue.
Would it PLEASE be possible for the project team to give some feedback?

It is one thing to have a technical problem.
It is a totally different thing not to communicate it in a timely fashion.

Michael.
____________
President of Rechenkraft.net

David Cameron
Project administrator
Project developer
Project tester
Project scientist
Joined: 13 May 14
Posts: 252
Credit: 2,028,556
RAC: 1
    
Message 6052 - Posted: 23 Jan 2017, 8:57:03 UTC

I had a look: it seems most WUs are successful and some are failing to upload, but I can't tell why. The only message I have in the log is:

2017-01-23 09:52:35.7969 [PID=29913] [CRITICAL] handle_get_file_size(): [XaqMDmu45hpn7jp7oou28CBqABFKDmABFKDmO5GKDmnMOKDm0iHenn_1_r67227843_ATLAS_result] returning error

I have restarted the web server so let me know if that helped. Meanwhile I will continue to investigate...

Yeti
Joined: 20 Jul 14
Posts: 699
Credit: 22,597,832
RAC: 0
    
Message 6053 - Posted: 23 Jan 2017, 9:00:46 UTC
Last modified: 23 Jan 2017, 9:01:21 UTC

David, it has to do with interrupted uploads.

I was able to watch one upload: when the big file was nearly fully uploaded, there was some kind of error and the upload was interrupted.

From that point on, the error appears. It must have something to do with "the file already exists (not complete)" or "the file uploader keeps this file open", so the file cannot be created a second time or overwritten.
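
The general mechanism behind this guess can be illustrated with a small, self-contained sketch. To be clear about the assumptions: this is not the BOINC file_upload_handler code, and whether the server uses this particular locking primitive is not established in this thread. It only shows, with Unix advisory locks on a temporary file, how a handle that is still open after an interrupted transfer can make every retry fail until the handle is released.

```python
import fcntl
import os
import tempfile

def try_exclusive_lock(path):
    """Open path and try to take a non-blocking exclusive lock.

    Returns the open file object on success (closing it releases the lock),
    or None if another open handle already holds the lock.
    """
    f = open(path, "ab")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Lock is held elsewhere (reported as BlockingIOError on Linux).
        f.close()
        return None
    return f

def demo():
    """Return (retry_blocked_while_held, retry_ok_after_release)."""
    fd, path = tempfile.mkstemp(suffix="_ATLAS_result")  # hypothetical file
    os.close(fd)
    try:
        first = try_exclusive_lock(path)    # interrupted upload still holds the file
        second = try_exclusive_lock(path)   # client retry while the lock is held
        blocked_while_held = second is None
        first.close()                       # stale handle finally goes away
        third = try_exclusive_lock(path)    # the next retry can proceed
        ok_after_release = third is not None
        if third is not None:
            third.close()
        return blocked_while_held, ok_after_release
    finally:
        os.remove(path)

if __name__ == "__main__":
    print(demo())
```

If something like this is happening on the server, retries from the client side cannot succeed on their own; the stale lock or partial file has to be cleared on the server.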

Michael H.W. Weber
Joined: 10 Jan 15
Posts: 108
Credit: 1,552,848
RAC: 0
    
Message 6054 - Posted: 23 Jan 2017, 9:29:08 UTC - in response to Message 6052.

I have restarted the web server so let me know if that helped. Meanwhile I will continue to investigate...

...unfortunately, so far, it did not help.

Michael.
____________
President of Rechenkraft.net
