Ptree corruption

Ptree corruption

Javier Henderson-3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I run two keyservers, one has just two peers, the other has about a dozen.

The one with a dozen peers sees periodic corruption, necessitating a rebuild. From what I've seen after asking the great google mind, this seems to happen periodically, and there was even a posting this morning by someone dropping out of the ring because of this problem.

Both of the keyservers I run are on 1.1.1, which I believe is latest and greatest.

Any thoughts on this?

- -jav

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.12 (Darwin)

iEYEARECAAYFAkvphmcACgkQFbNc8rut2uLbiQCglp7TBCiEDY8y0PN44FlME8nn
yQ4AoOgGsrc/fP/inBK0khN38CAlv2PD
=+JGv
-----END PGP SIGNATURE-----

_______________________________________________
Sks-devel mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/sks-devel

Re: Ptree corruption (and Debian .deb)

Teun Nijssen
Hi all,

on 2010-05-11 18:31 Javier Henderson wrote the following:
> I run two keyservers, one has just two peers, the other has about a dozen.
>
> The one with a dozen peers sees periodic corruption, necessitating a rebuild. From what I've seen after asking the great google mind, this seems to happen periodically, and there was even a posting this morning by someone dropping out of the ring because of this problem.
>
> Both of the keyservers I run are on 1.1.1, which I believe is latest and greatest.
>
> Any thoughts on this?

I'm running pgp.surfnet.nl which existed even before the Horowitz server was
developed. I have in the past years run SKS straight from the Debian
repositories on solid hardware, currently 1.1.0. So far I never had any
database corruption, neither DB nor Ptree.

A couple of days ago I set up a 1.1.1 server (xw0117.uvt.nl) on Ubuntu 10.04. It
recons only with pgp.surfnet.nl. On purpose I pre-loaded only 90% of the
Pramberger dump, to stress test pgp.surfnet.nl when it had to recon ~200k
keys. That took a few hours, but everything works and the surfnet machine had
no problem at all spitting out multiple 100 key blobs.

My only real problem appears to be locking on the surfnet box when people
elsewhere start reconning large numbers of keys. The server always continues
running, but Nagios gets upset with its slowness during the locking.

Does anyone running the Debian 'apt-get install sks' version 1.1.1 release
have corruption problems?

cheers,

teun



Re: Ptree corruption

John Clizbe-3
In reply to this post by Javier Henderson-3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Javier Henderson wrote:

> I run two keyservers, one has just two peers, the other has about a dozen.
>
> The one with a dozen peers sees periodic corruption, necessitating a rebuild.
> From what I've seen after asking the great google mind, this seems to happen
> periodically, and there was even a posting this morning by someone dropping out
> of the ring because of this problem.
>
> Both of the keyservers I run are on 1.1.1, which I believe is latest and greatest.
>
> Any thoughts on this?

Since rebuilding my server in June of last year, I've seen the "PTree may be
corrupted" error three times, each time preceded by large numbers of recovered
hashes:

2010-04-12 16:44:43 Raising Sys.Break -- PTree may be corrupted:
Bdb.DBError("unable to allocate memory for mutex; resize mutex region")

2010-04-22 10:23:32 Raising Sys.Break -- PTree may be corrupted:
Bdb.DBError("unable to allocate memory for mutex; resize mutex region")

2010-05-05 02:11:23 Raising Sys.Break -- PTree may be corrupted:
Bdb.DBError("unable to allocate memory for mutex; resize mutex region")

sks 1.1.1 with db-4.8 until 14-Apr-2010, db-5.0 currently on Slackware

I recover the same way each time: db_recover followed by db_checkpoint.

I can forward the log files, or excerpts, if anyone thinks they have a fix.

- --
John P. Clizbe                      Inet:  John (a) GingerBear.net
You can't spell fiasco without SCO. hkp://keyserver.gingerbear.net  or
     mailto:[hidden email]?subject=HELP

Q:"Just how do the residents of Haiku, Hawai'i hold conversations?"
A:"An odd melody / island voices on the winds / surplus of vowels"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11-svn5331-2010-05-07 (Windows XP)
Comment: When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!
Comment: Be part of the £€37 ECHELON -- Use Strong Encryption.
Comment: It's YOUR right - for the time being.
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iF4EAREIAAYFAkvqazQACgkQ614Z89ZWmCU0ngD/eB1KJarLtqFFh2D7HGCNuGjK
Rmgfnr5jYy/HxsT05jUA/0fBpIOFQlW37JMMLuOg9l3uSBDoTxSoNNYjU+j5EDeA
=cvJR
-----END PGP SIGNATURE-----
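
A minimal sketch of how the three log entries quoted in the message above could be picked out of an SKS log automatically. It assumes the error text and timestamp layout are exactly as quoted there, and that the Bdb.DBError detail may sit on the line after the timestamp. The names find_ptree_errors, PTREE_ERROR and SAMPLE_LOG are made up for illustration and are not part of SKS.

```python
import re

# Error line as quoted above:
# "<date> <time> Raising Sys.Break -- PTree may be corrupted:"
# Any text after the colon on the same line is captured as the detail.
PTREE_ERROR = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Raising Sys\.Break -- PTree may be corrupted:\s*(.*)$"
)


def find_ptree_errors(lines):
    """Return (timestamp, detail) pairs for each 'PTree may be corrupted' line.

    If the detail is not on the same line, the next non-blank line is
    assumed to be the detail (as in the excerpts quoted above).
    """
    events = []
    pending = None  # timestamp still waiting for its detail line
    for raw in lines:
        line = raw.strip()
        match = PTREE_ERROR.match(line)
        if match:
            if pending is not None:
                events.append((pending, ""))
                pending = None
            timestamp, detail = match.groups()
            if detail:
                events.append((timestamp, detail))
            else:
                pending = timestamp
        elif pending is not None and line:
            events.append((pending, line))
            pending = None
    if pending is not None:
        events.append((pending, ""))
    return events


# The three entries from the message above, used as sample input.
SAMPLE_LOG = """\
2010-04-12 16:44:43 Raising Sys.Break -- PTree may be corrupted:
Bdb.DBError("unable to allocate memory for mutex; resize mutex region")

2010-04-22 10:23:32 Raising Sys.Break -- PTree may be corrupted:
Bdb.DBError("unable to allocate memory for mutex; resize mutex region")

2010-05-05 02:11:23 Raising Sys.Break -- PTree may be corrupted:
Bdb.DBError("unable to allocate memory for mutex; resize mutex region")
"""

if __name__ == "__main__":
    for timestamp, detail in find_ptree_errors(SAMPLE_LOG.splitlines()):
        print(timestamp, detail)
```

Counting these events per day or week would show whether a configuration change actually stops the error from recurring.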


Re: Ptree corruption

John Clizbe-3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

John Clizbe wrote:

> Javier Henderson wrote:
>> I run two keyservers, one has just two peers, the other has about a dozen.
>
>> The one with a dozen peers sees periodic corruption, necessitating a rebuild.
>> From what I've seen after asking the great google mind, this seems to happen
>> periodically, and there was even a posting this morning by someone dropping out
>> of the ring because of this problem.
>
>> Both of the keyservers I run are on 1.1.1, which I believe is latest and greatest.
>
>> Any thoughts on this?
>

Googling the BDB error, and doing a bit of checking, it seems mutexes are indeed
a scarce quantity for the PTree database:

> sks@yogi:/var/sks# db50_stat -eh PTree
<SNIP>

> Active transactions:
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> 7MB 176KB       Mutex region size
> 0       The number of region locks that required waiting (0%)
> 4       Mutex alignment
> 1       Mutex test-and-set spins
> 58720   Mutex total count
> 46      Mutex free count
> 58674   Mutex in-use count
> 58674   Mutex maximum in-use count
> Mutex counts
> 46      Unallocated
> 1       env region
> 1       log filename
> 1       log flush
> 1       log region
> 1       mpoolfile handle
> 42267   mpool buffer
> 17      mpool file bucket
> 16381   mpool hash bucket
> 1       mpool region
> 1       mutex region
> 1       transaction checkpoint
> 1       txn region

I'm adding 'mutex_set_max 65535' to PTree's DB_CONFIG and we'll see if that
takes care of it. I ran db_recover on PTree before restarting.

KDB had ~10% free mutexes, so I didn't see an issue there.

- --
John P. Clizbe                      Inet:  John (a) GingerBear.net
You can't spell fiasco without SCO. hkp://keyserver.gingerbear.net  or
     mailto:[hidden email]?subject=HELP

Q:"Just how do the residents of Haiku, Hawai'i hold conversations?"
A:"An odd melody / island voices on the winds / surplus of vowels"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11-svn5331-2010-05-07 (Windows XP)
Comment: When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!
Comment: Be part of the £€37 ECHELON -- Use Strong Encryption.
Comment: It's YOUR right - for the time being.
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iF4EAREIAAYFAkvqcXcACgkQ614Z89ZWmCXHmgD/RwHpwK8y2VdQ+BthJKUQ1Rlq
0MZrIjYl7xKEdN9QXvsA/Ah0Z00mWIlo3rGvLAvtWrRrbScdZr2QqnrzqhEZMuJb
=ZWSF
-----END PGP SIGNATURE-----
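
To put the db50_stat figures above into proportion, here is a small sketch that parses the mutex counters and reports how much headroom is left. The helper names parse_mutex_stats and free_percent are hypothetical; the sample text is copied from the output quoted in the message, including its leading "> " quote markers.

```python
import re

# Counter labels as they appear in the quoted db50_stat output,
# mapped to short (made-up) dictionary keys.
WANTED = {
    "Mutex total count": "total",
    "Mutex free count": "free",
    "Mutex in-use count": "in_use",
    "Mutex maximum in-use count": "max_in_use",
}


def parse_mutex_stats(stat_output):
    """Extract the mutex counters from 'db_stat -e' style text."""
    stats = {}
    for raw in stat_output.splitlines():
        line = raw.lstrip("> ").strip()  # drop mail quote markers
        match = re.match(r"^(\d+)\s+(.*)$", line)
        if match and match.group(2) in WANTED:
            stats[WANTED[match.group(2)]] = int(match.group(1))
    return stats


def free_percent(stats):
    """Percentage of allocated mutexes that are still free."""
    return 100.0 * stats["free"] / stats["total"]


SAMPLE_STAT = """\
> 7MB 176KB       Mutex region size
> 58720   Mutex total count
> 46      Mutex free count
> 58674   Mutex in-use count
> 58674   Mutex maximum in-use count
"""

if __name__ == "__main__":
    stats = parse_mutex_stats(SAMPLE_STAT)
    print(stats, f"{free_percent(stats):.3f}% free")
```

For the quoted numbers, 46 free out of 58720 is well under 0.1%, compared with the roughly 10% free reported for KDB.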


Re: Ptree corruption

John Clizbe-3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

John Clizbe wrote:
>
> I'm adding 'mutex_set_max 65535' to PTree's DB_CONFIG and we'll see if that
> takes care of it. I ran db_recover on PTree before restarting.

Sorry for replying to myself, but after rereading the BDB docs,
http://www.oracle.com/technology/documentation/berkeley-db/db/api_reference/C/mutexset_max.html,
it would seem running db_recover is mandatory after setting mutex_set_max in
DB_CONFIG:

    If the database environment already exists when DB_ENV->open() is called,
    the information specified to DB_ENV->mutex_set_max() will be ignored.

- --
John P. Clizbe                      Inet:  John (a) GingerBear.net
You can't spell fiasco without SCO. hkp://keyserver.gingerbear.net  or
     mailto:[hidden email]?subject=HELP

Q:"Just how do the residents of Haiku, Hawai'i hold conversations?"
A:"An odd melody / island voices on the winds / surplus of vowels"


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11-svn5331-2010-05-07 (Windows XP)
Comment: When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!
Comment: Be part of the £€37 ECHELON -- Use Strong Encryption.
Comment: It's YOUR right - for the time being.
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iF4EAREIAAYFAkvqc/UACgkQ614Z89ZWmCXxbQD+LlcmcdU6DExPFisasiKGFiPm
igQGoKv6vV5ZYdVLT6MA/0b1xV5/U3vPuTLkm2/KIfpz5thBtwHJ1jBh1Op/DD9Z
=m+W/
-----END PGP SIGNATURE-----
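
A sketch of applying the setting discussed above without hand-editing: it rewrites DB_CONFIG text so that it contains exactly one mutex_set_max line. The function name set_mutex_max is hypothetical, and the other directives in the usage lines (set_lg_dir, set_cachesize and their values) are placeholders, not taken from anyone's actual configuration.

```python
def set_mutex_max(config_text, value):
    """Return DB_CONFIG text containing exactly one 'mutex_set_max <value>' line.

    An existing mutex_set_max line is replaced in place, duplicates are
    dropped, and the line is appended if it was missing.
    """
    new_line = f"mutex_set_max {value}"
    out = []
    replaced = False
    for line in config_text.splitlines():
        if line.split()[:1] == ["mutex_set_max"]:
            if not replaced:
                out.append(new_line)
                replaced = True
            # any further mutex_set_max lines are duplicates and are skipped
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Placeholder config content, for illustration only.
    print(set_mutex_max("set_lg_dir logs\n", 65535), end="")
```

As the quoted documentation says, the value is ignored while the existing environment is in place, so the change only takes effect after the environment has been recreated, for example by running db_recover on PTree with SKS stopped.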


Re: Ptree corruption (and Debian .deb)

Dinko Korunic
In reply to this post by Teun Nijssen
On Wed, May 12, 2010 at 09:47:47AM +0200, Teun Nijssen wrote:
> I'm running pgp.surfnet.nl which existed even before the Horowitz server was
> developed. I have in the past years run SKS straight from the Debian
> repositories on solid hardware, currently 1.1.0. So far I never had any
> database corruption, neither DB nor Ptree.

I second this. I have been running pks.aaiedu.hr (formerly pks.carnet.hr)
for the last ~12 years, first on Horowitz and then on SKS. I have never, ever
had db corruption with SKS either.

However, it is exclusively running under Solaris and I'm not using
precompiled packages. And I tend to run db_recover before starting sks
processes, as a precaution.

--
NAME:Dinko.kreator.Korunic   DISCLAIMER:Standard.disclaimer.applies
ICQ:16965294        JAB:[hidden email]        PGP:0xEA160D0B
HOME:http://dkorunic.net    QUOTE:Eat.right.stay.fit.and.die.anyway
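
The precaution described above (db_recover before starting the sks processes) could be scripted roughly as follows. This is only a sketch under assumptions: the base directory /var/sks and the environment names KDB and PTree are taken from other messages in this thread, the utility may be installed under a versioned name (db50_recover, for example), and the helper names are invented. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def recovery_commands(basedir="/var/sks", envs=("KDB", "PTree"), tool="db_recover"):
    """Build one 'db_recover -h <environment dir>' command per environment."""
    return [[tool, "-h", f"{basedir}/{env}"] for env in envs]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for cmd in commands:
        text = shlex.join(cmd)
        print(text)
        printed.append(text)
        if not dry_run:
            # check=True stops at the first command that fails
            subprocess.run(cmd, check=True)
    return printed


if __name__ == "__main__":
    # Dry run: shows what would be executed before SKS is started.
    run_commands(recovery_commands())
```

Recovery should only be run while no SKS process has the environments open, which is why it fits naturally into a start-up script ahead of the sks processes.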


Re: Ptree corruption

Kim Minh Kaplan
In reply to this post by Javier Henderson-3
Javier Henderson writes:

> I run two keyservers, one has just two peers, the other has about a dozen.
>
> The one with a dozen peers sees periodic corruption, necessitating a rebuild. From what I've seen after asking the great google mind, this seems to happen periodically, and there was even a posting this morning by someone dropping out of the ring because of this problem.
>
> Both of the keyservers I run are on 1.1.1, which I believe is latest and greatest.
>
> Any thoughts on this?

I used to have some similar problems[1] but since I moved to new
hardware (new housing really) I am now exempt from any such thing.
Since moving (summer 2008) I have not had any SKS corruption.  It has
always been running on a stable Debian, but SKS compiled from latest
sources.

Kim Minh.

[1] see this thread:
http://lists.nongnu.org/archive/html/sks-devel/2007-11/msg00002.html
--
Kim Minh


Re: Ptree corruption

Javier Henderson-3

On May 12, 2010, at 1:54 PM, Kim Minh Kaplan wrote:

> Javier Henderson writes:
>
>> I run two keyservers, one has just two peers, the other has about a dozen.
>>
>> The one with a dozen peers sees periodic corruption, necessitating a rebuild. From what I've seen after asking the great google mind, this seems to happen periodically, and there was even a posting this morning by someone dropping out of the ring because of this problem.
>>
>> Both of the keyservers I run are on 1.1.1, which I believe is latest and greatest.
>>
>> Any thoughts on this?
>
> I used to have some similar problems[1] but since I moved to new
> hardware (new housing really) I am now exempt from any such thing.
> Since moving (summer 2008) I have not had any SKS corruption.  It has
> always been running on a stable Debian, but SKS compiled from latest
> sources.

I am running it on FreeBSD 8.0-RELEASE; I wonder whether some BDB values need to be tweaked.

-jav

